Introduction and business problem

Brooklyn is one of the most diverse places in New York and is home to over 30% of New York City's population. However, when it comes to vegetarian food, few places offer good deals. Vegetarianism is more popular among young people. In this project I'll propose some possible locations for such a venue.

The problem to solve:

Open a vegetarian restaurant in the mid-range price category in Brooklyn. Specifically, I take the NYC College of Technology as the baseline location. The place is close to subway stations (as will be shown below) and does not have many dining venues so far. The stakeholders are therefore mostly young people who prefer vegetarian food and people employed in the vicinity.

Data

In this project, I'll be using several data sources, including NYC real estate sales data, Foursquare venue data, university locations and subway station locations.

These will be used to build a profile of the surrounding neighborhood and to judge whether the area is worth starting a business in. For instance, NYC real estate data will give first insights into the price range for buying a place, if needed, as well as into the solvency of the local population. Foursquare data will help build a profile of the restaurants in the vicinity and identify potential competitors. University locations will help determine whether the starting point (to be discussed in the methodology section of the main report) is a hub of universities. Subway stations will be used to illustrate the commute options.

All of the sources above, with the exception of Foursquare, were downloaded as shapefiles and, since the JSON option did not work for every dataset, later converted to CSV using the mygeodata service: https://mygeodata.cloud/converter/shp-to-csv

Methodology

The purpose of this project is to identify possible locations for a vegetarian restaurant in Brooklyn, NYC. The project makes active use of spatial data from the NYU Spatial Data Repository.

The initial assumption of the research is that mid-range real estate prices go together with mid-range restaurants. This assumption is tested in the course of the research. In the first stage, the geodata for NY real estate is segmented (using k-means clustering) to map the distribution of building types sold in 2016. The data is somewhat outdated, yet for the purposes of this project, given no major price shifts in the previous years, it still shows the structural distribution of the properties sold.

Once done, the clusters will be mapped using the folium library.

Then, we'll use Foursquare data to get the list and details of venues within a radius of 2 km. This will give us a more comprehensive view and let us build a profile of the places based on their price tier, location, tips, likes, rating and similar characteristics. Once done, a second clustering will be run, this time on the venues in question. This step will help locate the competitors. The results will likewise be shown on a map.
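
The venue lookup described above could be set up roughly as follows. This is only a sketch: it assumes the legacy Foursquare v2 `venues/explore` endpoint, and `CLIENT_ID`, `CLIENT_SECRET` and the version date are placeholders, not values from this project. The coordinates are those of the NYC College of Technology used in the maps in this notebook. The function only builds the request URL; no request is sent.

```python
from urllib.parse import urlencode

# Placeholder credentials (hypothetical; replace with your own Foursquare keys)
CLIENT_ID = 'YOUR_CLIENT_ID'
CLIENT_SECRET = 'YOUR_CLIENT_SECRET'
VERSION = '20180605'  # API version date, an assumed value

def build_explore_url(lat, lng, radius_m=2000, limit=50, query=None):
    """Build a URL for the legacy Foursquare v2 'venues/explore' endpoint."""
    params = {
        'client_id': CLIENT_ID,
        'client_secret': CLIENT_SECRET,
        'v': VERSION,
        'll': '{},{}'.format(lat, lng),
        'radius': radius_m,   # search radius in meters (2 km here)
        'limit': limit,
    }
    if query is not None:
        params['query'] = query
    return 'https://api.foursquare.com/v2/venues/explore?' + urlencode(params)

url = build_explore_url(40.695457, -73.9864678851903, query='vegetarian')
print(url)
```

The response could then be fetched with `requests.get(url).json()` and flattened with `json_normalize`, both of which the notebook already imports.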

In the third stage, I'll examine whether any other educational institutions are in the vicinity and how good the commute is. Once all of that is put together, some street names for a possible restaurant opening will be presented.

The starting point for the location analysis is the NYC College of Technology. This place is chosen with a view to reaching young people and employees who work nearby, thus covering the target audience.
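
To check what lies within a given distance of the starting point (other institutions, subway stations, venues), a great-circle distance is accurate enough at this scale. The sketch below uses the haversine formula. The college coordinates are the ones plotted on the maps in this notebook, and the two neighborhood points are taken from the Brooklyn neighborhoods table; the function names are illustrative only.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometers."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def within_radius(center, points, radius_km):
    """Return the names of the points that lie within radius_km of center."""
    return [name for name, lat, lon in points
            if haversine_km(center[0], center[1], lat, lon) <= radius_km]

college = (40.695457, -73.9864678851903)   # NYC College of Technology
points = [('Greenpoint', 40.730201, -73.954241),
          ('Bay Ridge', 40.625801, -74.030621)]

for name, lat, lon in points:
    print(name, round(haversine_km(college[0], college[1], lat, lon), 2), 'km')
print(within_radius(college, points, radius_km=5))
```

The same filter, with `radius_km=2`, would apply to the 2 km venue radius mentioned in the methodology.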

In [1]:
#import necessary libraries 
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import numpy as np

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (top-level import, pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')
Libraries imported.
In [2]:
with open('ny_data/newyork_data.json') as json_data:
    newyork_data = json.load(json_data)
In [3]:
newyork_data['features'][0]
Out[3]:
{'type': 'Feature',
 'id': 'nyu_2451_34572.1',
 'geometry': {'type': 'Point',
  'coordinates': [-73.84720052054902, 40.89470517661]},
 'geometry_name': 'geom',
 'properties': {'name': 'Wakefield',
  'stacked': 1,
  'annoline1': 'Wakefield',
  'annoline2': None,
  'annoline3': None,
  'annoangle': 0.0,
  'borough': 'Bronx',
  'bbox': [-73.84720052054902,
   40.89470517661,
   -73.84720052054902,
   40.89470517661]}}
In [4]:
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)
In [5]:
# collect one row per neighborhood, then build the dataframe in one step
# (DataFrame.append was removed in pandas 2.0)
rows = []
for data in newyork_data['features']:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    neighborhood_latlon = data['geometry']['coordinates']  # [longitude, latitude]
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

neighborhoods = pd.DataFrame(rows, columns=column_names)
In [6]:
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(neighborhoods['Borough'].unique()),
        neighborhoods.shape[0]) )
The dataframe has 5 boroughs and 306 neighborhoods.
In [7]:
neighborhoods['Borough'].unique()
Out[7]:
array(['Bronx', 'Manhattan', 'Brooklyn', 'Queens', 'Staten Island'],
      dtype=object)
In [8]:
brooklyn_data = neighborhoods[neighborhoods['Borough'] == 'Brooklyn'].reset_index(drop=True)
brooklyn_data.head()
Out[8]:
Borough Neighborhood Latitude Longitude
0 Brooklyn Bay Ridge 40.625801 -74.030621
1 Brooklyn Bensonhurst 40.611009 -73.995180
2 Brooklyn Sunset Park 40.645103 -74.010316
3 Brooklyn Greenpoint 40.730201 -73.954241
4 Brooklyn Gravesend 40.595260 -73.973471
In [9]:
#open the real estate data file

with open('ny_data/nyu-2451-34678-geojson.json') as json_data:
    newyork_sale_ft_data = json.load(json_data)
In [10]:
sales_data = json_normalize(newyork_sale_ft_data['features'])
In [11]:
sales_data.columns[0].split('.')[1]
Out[11]:
'coordinates'
In [12]:
columns = sales_data.columns
cols = []
for i in columns:
    if len(i.split('.')) > 1:
        cols.append(i.split('.')[1])
    else:
        cols.append(i.split('.')[0])
        
sales_data.columns = cols
cols
Out[12]:
['coordinates',
 'type',
 'geometry_name',
 'id',
 'address',
 'apt',
 'bbl_id',
 'bbox',
 'bldg_cls_p',
 'bldg_cls_s',
 'bldg_ctgy',
 'block',
 'borough',
 'com_unit',
 'easmnt',
 'georesult',
 'land_sqft',
 'lat',
 'long',
 'lot',
 'message',
 'nbhd',
 'price',
 'res_unit',
 'sale_date',
 'sale_id',
 'tax_cls_p',
 'tax_cls_s',
 'tot_sqft',
 'tot_unit',
 'usable',
 'year',
 'yr_built',
 'zip',
 'type']
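
Note that the flattened list contains `type` twice. A plausible explanation (an assumption, since the original dotted column names are not shown) is that `json_normalize` produced both a nested `geometry.type` and a top-level `type`, which collide once the prefix is dropped. The sketch below uses a few hypothetical dotted names of that shape to show how such collisions can be detected, and how keeping the full name for colliding columns avoids them.

```python
from collections import Counter

# Hypothetical dotted names, shaped like json_normalize output for GeoJSON features
raw_columns = ['geometry.coordinates', 'geometry.type', 'geometry_name',
               'id', 'properties.address', 'properties.price', 'type']

def short_name(col):
    """Drop everything up to the last dot: 'properties.price' -> 'price'."""
    return col.rsplit('.', 1)[-1]

short = [short_name(c) for c in raw_columns]
duplicates = [name for name, n in Counter(short).items() if n > 1]
print(duplicates)

# Keep the full dotted name (with '_' instead of '.') wherever the short name collides
safe = [c.replace('.', '_') if short_name(c) in duplicates else short_name(c)
        for c in raw_columns]
print(safe)
```

With duplicated labels in a pandas dataframe, selecting `sales_data['type']` returns two columns instead of one, so a check like this is worth running before selecting columns by name.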
In [13]:
sales_data.sample()
Out[13]:
coordinates type geometry_name id address apt bbl_id bbox bldg_cls_p bldg_cls_s bldg_ctgy block borough com_unit easmnt georesult land_sqft lat long lot message nbhd price res_unit sale_date sale_id tax_cls_p tax_cls_s tot_sqft tot_unit usable year yr_built zip type
81703 [-74.10749160996673, 40.55987673335868] Point geom nyu_2451_34678.81704 278 FINLEY AVENUE None 5407381 [-74.10749160996673, 40.55987673335868, -74.10... A5 A5 01 ONE FAMILY DWELLINGS 4073 5 0 None Address Match 2400 40.559877 -74.107492 81 None NEW DORP-BEACH 111000 1 09/17/2015 81704 1 1 2008 1 Y 2015 1983 10306 Feature

Clustering the neighbourhoods on real estate sales

Let's first cluster the neighbourhoods of Brooklyn by real estate sales data to get some first insights into the neighbourhoods' economic activity.

In the data, borough 3 is Brooklyn, and only usable records are kept, i.e. those with a sale price > 10 USD.

In [14]:
brooklyn_property_sales = sales_data[(sales_data['borough'] == 3) & (sales_data['usable'] == 'Y')] 

brooklyn_property_sales = brooklyn_property_sales[['borough','nbhd', 'address', 'zip', 'lat', 'long', 'bldg_ctgy', 'bldg_cls_s', 'tax_cls_s', 'land_sqft','price']]

brooklyn_property_sales.bldg_cls_s.value_counts()

# one-hot encode the selected columns; prefix 'CAT_' plus the default separator '_' yields 'CAT__...' names

brooklyn_property_sales = pd.get_dummies(brooklyn_property_sales, prefix='CAT_', columns=['bldg_ctgy', 'bldg_cls_s'])


brooklyn_property_sales.head()
Out[14]:
borough nbhd address zip lat long tax_cls_s land_sqft price CAT__01 ONE FAMILY DWELLINGS CAT__02 TWO FAMILY DWELLINGS CAT__03 THREE FAMILY DWELLINGS CAT__04 TAX CLASS 1 CONDOS CAT__05 TAX CLASS 1 VACANT LAND CAT__06 TAX CLASS 1 - OTHER CAT__07 RENTALS - WALKUP APARTMENTS CAT__08 RENTALS - ELEVATOR APARTMENTS CAT__09 COOPS - WALKUP APARTMENTS CAT__10 COOPS - ELEVATOR APARTMENTS CAT__11 SPECIAL CONDO BILLING LOTS CAT__11A CONDO-RENTALS CAT__12 CONDOS - WALKUP APARTMENTS CAT__13 CONDOS - ELEVATOR APARTMENTS CAT__14 RENTALS - 4-10 UNIT CAT__15 CONDOS - 2-10 UNIT RESIDENTIAL CAT__16 CONDOS - 2-10 UNIT WITH COMMERCIAL UNIT CAT__17 CONDO COOPS CAT__18 TAX CLASS 3 - UNTILITY PROPERTIES CAT__21 OFFICE BUILDINGS CAT__22 STORE BUILDINGS CAT__23 LOFT BUILDINGS CAT__27 FACTORIES CAT__28 COMMERCIAL CONDOS CAT__29 COMMERCIAL GARAGES CAT__30 WAREHOUSES CAT__31 COMMERCIAL VACANT LAND CAT__32 HOSPITAL AND HEALTH FACILITIES CAT__33 EDUCATIONAL FACILITIES CAT__35 INDOOR PUBLIC AND CULTURAL FACILITIES CAT__36 OUTDOOR RECREATIONAL FACILITIES CAT__37 RELIGIOUS FACILITIES CAT__38 ASYLUMS AND HOMES CAT__39 TRANSPORTATION FACILITIES CAT__41 TAX CLASS 4 - OTHER CAT__42 CONDO CULTURAL/MEDICAL/EDUCATIONAL/ETC CAT__43 CONDO OFFICE BUILDINGS CAT__44 CONDO PARKING CAT__46 CONDO STORE BUILDINGS CAT__47 CONDO NON-BUSINESS STORAGE CAT__48 CONDO TERRACES/GARDENS/CABANAS CAT__49 CONDO WAREHOUSES/FACTORY/INDUS CAT__A1 CAT__A2 CAT__A3 CAT__A4 CAT__A5 CAT__A9 CAT__B1 CAT__B2 CAT__B3 CAT__B9 CAT__C0 CAT__C1 CAT__C2 CAT__C3 CAT__C4 CAT__C5 CAT__C6 CAT__C7 CAT__C8 CAT__C9 CAT__D0 CAT__D1 CAT__D3 CAT__D4 CAT__D5 CAT__D6 CAT__D7 CAT__D8 CAT__D9 CAT__E1 CAT__E2 CAT__E3 CAT__E7 CAT__E9 CAT__F1 CAT__F2 CAT__F4 CAT__F5 CAT__F8 CAT__F9 CAT__G0 CAT__G1 CAT__G2 CAT__G3 CAT__G4 CAT__G5 CAT__G6 CAT__G7 CAT__G8 CAT__G9 CAT__GU CAT__I1 CAT__I4 CAT__I5 CAT__I6 CAT__I7 CAT__I9 CAT__K1 CAT__K2 CAT__K4 CAT__K5 CAT__K9 CAT__L9 CAT__M1 CAT__M2 CAT__M3 CAT__M4 CAT__M9 CAT__N2 CAT__N9 CAT__O1 CAT__O2 CAT__O5 CAT__O6 CAT__O7 
CAT__O8 CAT__O9 CAT__P3 CAT__P5 CAT__P6 CAT__Q9 CAT__R0 CAT__R1 CAT__R2 CAT__R3 CAT__R4 CAT__R5 CAT__R6 CAT__R8 CAT__R9 CAT__RA CAT__RB CAT__RG CAT__RK CAT__RP CAT__RR CAT__RS CAT__RT CAT__RW CAT__S0 CAT__S1 CAT__S2 CAT__S3 CAT__S4 CAT__S5 CAT__S9 CAT__T9 CAT__U7 CAT__V0 CAT__V1 CAT__V2 CAT__V3 CAT__V9 CAT__W1 CAT__W2 CAT__W3 CAT__W8 CAT__W9 CAT__Z0 CAT__Z9
6291 3 BATH BEACH 8647 15TH AVENUE 11228 40.610414 -74.010528 1 1547 758000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6292 3 BATH BEACH 55 BAY 10TH STREET 11228 40.609857 -74.009897 1 1933 778000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6294 3 BATH BEACH 1906 86TH STREET 11214 40.605798 -74.000248 1 1900 1365000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6295 3 BATH BEACH 50 BAY 23RD STREET 11214 40.604094 -74.000011 1 2417 750000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6296 3 BATH BEACH 1964 86TH STREET 11214 40.604966 -73.998862 1 1725 1470000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
In [15]:
address = 'Brooklyn, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Brooklyn borough are {}, {}.'.format(latitude, longitude))
The geographical coordinates of Brooklyn borough are 40.6501038, -73.9495823.
In [16]:
# create map of Brooklyn using latitude and longitude values
map_brooklyn = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(brooklyn_property_sales['lat'], brooklyn_property_sales['long'], brooklyn_property_sales['nbhd']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_brooklyn)  


#add New York City College of Technology to the map (big dark circle)
label = folium.Popup('New York City College of Technology', parse_html=True)
folium.CircleMarker(
        [40.695457, -73.9864678851903],
        radius=20,
        popup=label,
        color='darkgreen',  # 'darkgreen' (no space) is the valid CSS color name
        fill=True,
        fill_color='darkgreen',
        fill_opacity=0.7,
        parse_html=False).add_to(map_brooklyn)     
    
map_brooklyn
Out[16]:
In [17]:
brooklyn_property_sales_cluster = brooklyn_property_sales.iloc[:,6:]
brooklyn_property_sales_cluster.head()
Out[17]:
tax_cls_s land_sqft price CAT__01 ONE FAMILY DWELLINGS CAT__02 TWO FAMILY DWELLINGS CAT__03 THREE FAMILY DWELLINGS CAT__04 TAX CLASS 1 CONDOS CAT__05 TAX CLASS 1 VACANT LAND CAT__06 TAX CLASS 1 - OTHER CAT__07 RENTALS - WALKUP APARTMENTS CAT__08 RENTALS - ELEVATOR APARTMENTS CAT__09 COOPS - WALKUP APARTMENTS CAT__10 COOPS - ELEVATOR APARTMENTS CAT__11 SPECIAL CONDO BILLING LOTS CAT__11A CONDO-RENTALS CAT__12 CONDOS - WALKUP APARTMENTS CAT__13 CONDOS - ELEVATOR APARTMENTS CAT__14 RENTALS - 4-10 UNIT CAT__15 CONDOS - 2-10 UNIT RESIDENTIAL CAT__16 CONDOS - 2-10 UNIT WITH COMMERCIAL UNIT CAT__17 CONDO COOPS CAT__18 TAX CLASS 3 - UNTILITY PROPERTIES CAT__21 OFFICE BUILDINGS CAT__22 STORE BUILDINGS CAT__23 LOFT BUILDINGS CAT__27 FACTORIES CAT__28 COMMERCIAL CONDOS CAT__29 COMMERCIAL GARAGES CAT__30 WAREHOUSES CAT__31 COMMERCIAL VACANT LAND CAT__32 HOSPITAL AND HEALTH FACILITIES CAT__33 EDUCATIONAL FACILITIES CAT__35 INDOOR PUBLIC AND CULTURAL FACILITIES CAT__36 OUTDOOR RECREATIONAL FACILITIES CAT__37 RELIGIOUS FACILITIES CAT__38 ASYLUMS AND HOMES CAT__39 TRANSPORTATION FACILITIES CAT__41 TAX CLASS 4 - OTHER CAT__42 CONDO CULTURAL/MEDICAL/EDUCATIONAL/ETC CAT__43 CONDO OFFICE BUILDINGS CAT__44 CONDO PARKING CAT__46 CONDO STORE BUILDINGS CAT__47 CONDO NON-BUSINESS STORAGE CAT__48 CONDO TERRACES/GARDENS/CABANAS CAT__49 CONDO WAREHOUSES/FACTORY/INDUS CAT__A1 CAT__A2 CAT__A3 CAT__A4 CAT__A5 CAT__A9 CAT__B1 CAT__B2 CAT__B3 CAT__B9 CAT__C0 CAT__C1 CAT__C2 CAT__C3 CAT__C4 CAT__C5 CAT__C6 CAT__C7 CAT__C8 CAT__C9 CAT__D0 CAT__D1 CAT__D3 CAT__D4 CAT__D5 CAT__D6 CAT__D7 CAT__D8 CAT__D9 CAT__E1 CAT__E2 CAT__E3 CAT__E7 CAT__E9 CAT__F1 CAT__F2 CAT__F4 CAT__F5 CAT__F8 CAT__F9 CAT__G0 CAT__G1 CAT__G2 CAT__G3 CAT__G4 CAT__G5 CAT__G6 CAT__G7 CAT__G8 CAT__G9 CAT__GU CAT__I1 CAT__I4 CAT__I5 CAT__I6 CAT__I7 CAT__I9 CAT__K1 CAT__K2 CAT__K4 CAT__K5 CAT__K9 CAT__L9 CAT__M1 CAT__M2 CAT__M3 CAT__M4 CAT__M9 CAT__N2 CAT__N9 CAT__O1 CAT__O2 CAT__O5 CAT__O6 CAT__O7 CAT__O8 CAT__O9 CAT__P3 CAT__P5 
CAT__P6 CAT__Q9 CAT__R0 CAT__R1 CAT__R2 CAT__R3 CAT__R4 CAT__R5 CAT__R6 CAT__R8 CAT__R9 CAT__RA CAT__RB CAT__RG CAT__RK CAT__RP CAT__RR CAT__RS CAT__RT CAT__RW CAT__S0 CAT__S1 CAT__S2 CAT__S3 CAT__S4 CAT__S5 CAT__S9 CAT__T9 CAT__U7 CAT__V0 CAT__V1 CAT__V2 CAT__V3 CAT__V9 CAT__W1 CAT__W2 CAT__W3 CAT__W8 CAT__W9 CAT__Z0 CAT__Z9
6291 1 1547 758000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6292 1 1933 778000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6294 1 1900 1365000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6295 1 2417 750000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6296 1 1725 1470000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
In [18]:
from sklearn.cluster import KMeans
from sklearn import metrics
from scipy.spatial.distance import cdist
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler

# scale the data before clustering
se = StandardScaler()


X = se.fit_transform(brooklyn_property_sales_cluster)

# k-means: determine k with the elbow method
distortions = []
K = range(1,10)
for k in K:
    # fit once per k; random_state makes the curve reproducible
    kmeanModel = KMeans(n_clusters=k, random_state=0).fit(X)
    # distortion: mean distance from each point to its nearest cluster center
    distortions.append(sum(np.min(cdist(X, kmeanModel.cluster_centers_, 'euclidean'), axis=1)) / X.shape[0])

# Plot the elbow
plt.plot(K, distortions, 'bx-')
plt.xlabel('k')
plt.ylabel('Distortion')
plt.title('The Elbow Method showing the optimal k')
plt.show()
/usr/local/anaconda3/lib/python3.7/site-packages/sklearn/preprocessing/data.py:645: DataConversionWarning: Data with input dtype uint8, int64, object were all converted to float64 by StandardScaler.
  return self.partial_fit(X, y)
/usr/local/anaconda3/lib/python3.7/site-packages/sklearn/base.py:464: DataConversionWarning: Data with input dtype uint8, int64, object were all converted to float64 by StandardScaler.
  return self.fit(X, **fit_params).transform(X)
<Figure size 640x480 with 1 Axes>
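
Scaling matters here because `price` is in the hundreds of thousands or millions while the one-hot columns are 0 or 1; without it, the Euclidean distances used by k-means would be dominated by price. As a rough illustration of what `StandardScaler` does per column (subtract the mean, divide by the population standard deviation), here is a plain-Python sketch using the five Bath Beach sale prices shown earlier. The helper name `standardize` is made up for this example.

```python
from statistics import mean, pstdev

def standardize(values):
    """Z-score a list: (x - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population std (ddof=0), matching StandardScaler
    return [(x - mu) / sigma for x in values]

prices = [758000, 778000, 1365000, 750000, 1470000]
z = standardize(prices)
print([round(v, 2) for v in z])
```

After scaling, the column has mean 0 and unit variance, so it contributes to distances on the same footing as the other scaled columns.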
In [19]:
# let's stick with 5 clusters

kclusters = 5

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(X)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 
Out[19]:
array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int32)
In [20]:
len(kmeans.labels_)
Out[20]:
14797
In [21]:
brooklyn_property_sales['clusters'] = kmeans.labels_
In [22]:
brooklyn_property_sales['clusters'] = brooklyn_property_sales['clusters'].dropna().astype(int)
In [23]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(brooklyn_property_sales['lat'], brooklyn_property_sales['long'], brooklyn_property_sales['nbhd'], brooklyn_property_sales['clusters']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],  # labels run 0..kclusters-1, so index the palette directly
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)

    
#add New York City College of Technology to the map (big dark circle)  
label = folium.Popup('New York City College of Technology', parse_html=True)
folium.CircleMarker(
        [40.695457, -73.9864678851903],
        radius=20,
        popup=label,
        color='darkgreen',  # 'darkgreen' (no space) is the valid CSS color name
        fill=True,
        fill_color='darkgreen',
        fill_opacity=0.7,
        parse_html=False).add_to(map_clusters)    
       
map_clusters
Out[23]:

Analyzing the clusters

In [24]:
brooklyn_property_sales['clusters'].value_counts()
Out[24]:
0    7297
4    5346
1    2083
2      62
3       9
Name: clusters, dtype: int64

Cluster 0

In [25]:
brooklyn_property_sales[brooklyn_property_sales['clusters'] == 0][:10]
Out[25]:
borough nbhd address zip lat long tax_cls_s land_sqft price CAT__01 ONE FAMILY DWELLINGS CAT__02 TWO FAMILY DWELLINGS CAT__03 THREE FAMILY DWELLINGS CAT__04 TAX CLASS 1 CONDOS CAT__05 TAX CLASS 1 VACANT LAND CAT__06 TAX CLASS 1 - OTHER CAT__07 RENTALS - WALKUP APARTMENTS CAT__08 RENTALS - ELEVATOR APARTMENTS CAT__09 COOPS - WALKUP APARTMENTS CAT__10 COOPS - ELEVATOR APARTMENTS CAT__11 SPECIAL CONDO BILLING LOTS CAT__11A CONDO-RENTALS CAT__12 CONDOS - WALKUP APARTMENTS CAT__13 CONDOS - ELEVATOR APARTMENTS CAT__14 RENTALS - 4-10 UNIT CAT__15 CONDOS - 2-10 UNIT RESIDENTIAL CAT__16 CONDOS - 2-10 UNIT WITH COMMERCIAL UNIT CAT__17 CONDO COOPS CAT__18 TAX CLASS 3 - UNTILITY PROPERTIES CAT__21 OFFICE BUILDINGS CAT__22 STORE BUILDINGS CAT__23 LOFT BUILDINGS CAT__27 FACTORIES CAT__28 COMMERCIAL CONDOS CAT__29 COMMERCIAL GARAGES CAT__30 WAREHOUSES CAT__31 COMMERCIAL VACANT LAND CAT__32 HOSPITAL AND HEALTH FACILITIES CAT__33 EDUCATIONAL FACILITIES CAT__35 INDOOR PUBLIC AND CULTURAL FACILITIES CAT__36 OUTDOOR RECREATIONAL FACILITIES CAT__37 RELIGIOUS FACILITIES CAT__38 ASYLUMS AND HOMES CAT__39 TRANSPORTATION FACILITIES CAT__41 TAX CLASS 4 - OTHER CAT__42 CONDO CULTURAL/MEDICAL/EDUCATIONAL/ETC CAT__43 CONDO OFFICE BUILDINGS CAT__44 CONDO PARKING CAT__46 CONDO STORE BUILDINGS CAT__47 CONDO NON-BUSINESS STORAGE CAT__48 CONDO TERRACES/GARDENS/CABANAS CAT__49 CONDO WAREHOUSES/FACTORY/INDUS CAT__A1 CAT__A2 CAT__A3 CAT__A4 CAT__A5 CAT__A9 CAT__B1 CAT__B2 CAT__B3 CAT__B9 CAT__C0 CAT__C1 CAT__C2 CAT__C3 CAT__C4 CAT__C5 CAT__C6 CAT__C7 CAT__C8 CAT__C9 CAT__D0 CAT__D1 CAT__D3 CAT__D4 CAT__D5 CAT__D6 CAT__D7 CAT__D8 CAT__D9 CAT__E1 CAT__E2 CAT__E3 CAT__E7 CAT__E9 CAT__F1 CAT__F2 CAT__F4 CAT__F5 CAT__F8 CAT__F9 CAT__G0 CAT__G1 CAT__G2 CAT__G3 CAT__G4 CAT__G5 CAT__G6 CAT__G7 CAT__G8 CAT__G9 CAT__GU CAT__I1 CAT__I4 CAT__I5 CAT__I6 CAT__I7 CAT__I9 CAT__K1 CAT__K2 CAT__K4 CAT__K5 CAT__K9 CAT__L9 CAT__M1 CAT__M2 CAT__M3 CAT__M4 CAT__M9 CAT__N2 CAT__N9 CAT__O1 CAT__O2 CAT__O5 CAT__O6 CAT__O7 
CAT__O8 CAT__O9 CAT__P3 CAT__P5 CAT__P6 CAT__Q9 CAT__R0 CAT__R1 CAT__R2 CAT__R3 CAT__R4 CAT__R5 CAT__R6 CAT__R8 CAT__R9 CAT__RA CAT__RB CAT__RG CAT__RK CAT__RP CAT__RR CAT__RS CAT__RT CAT__RW CAT__S0 CAT__S1 CAT__S2 CAT__S3 CAT__S4 CAT__S5 CAT__S9 CAT__T9 CAT__U7 CAT__V0 CAT__V1 CAT__V2 CAT__V3 CAT__V9 CAT__W1 CAT__W2 CAT__W3 CAT__W8 CAT__W9 CAT__Z0 CAT__Z9 clusters
6499 3 BATH BEACH 8674 17TH AVENUE 11214 40.607297 -74.006220 2 2432 1250000 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6504 3 BATH BEACH 15 BAY 23RD STREET 11214 40.604558 -73.999499 2 2417 1330000 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6505 3 BATH BEACH 8672 21ST AVENUE 11214 40.602010 -73.997389 2 2433 1380000 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6507 3 BATH BEACH 45 BAY 28TH STREET 11214 40.601928 -73.996237 2 5800 5040000 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6509 3 BATH BEACH 8717 21ST AVENUE 11214 40.600874 -73.998542 2 6767 3195000 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6510 3 BATH BEACH 8758 BAY PARKWAY 11214 40.598953 -73.996936 2 9691 15150000 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6514 3 BATH BEACH 1859-1861 CROPSEY AVENUE 11214 40.601738 -74.005708 2 3814 1000000 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6515 3 BATH BEACH 194 BAY 22ND STREET 11214 40.601645 -74.003720 2 5051 3100000 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6516 3 BATH BEACH 141 BAY 20TH, 3 11214 40.603632 -74.004044 2 0 350000 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
6517 3 BATH BEACH 139 BAY 20TH, 1 11214 40.603663 -74.004015 2 0 420000 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Cluster 1

In [26]:
brooklyn_property_sales[brooklyn_property_sales['clusters'] == 1][:10]
Out[26]:
borough nbhd address zip lat long tax_cls_s land_sqft price CAT__01 ONE FAMILY DWELLINGS CAT__02 TWO FAMILY DWELLINGS CAT__03 THREE FAMILY DWELLINGS CAT__04 TAX CLASS 1 CONDOS CAT__05 TAX CLASS 1 VACANT LAND CAT__06 TAX CLASS 1 - OTHER CAT__07 RENTALS - WALKUP APARTMENTS CAT__08 RENTALS - ELEVATOR APARTMENTS CAT__09 COOPS - WALKUP APARTMENTS CAT__10 COOPS - ELEVATOR APARTMENTS CAT__11 SPECIAL CONDO BILLING LOTS CAT__11A CONDO-RENTALS CAT__12 CONDOS - WALKUP APARTMENTS CAT__13 CONDOS - ELEVATOR APARTMENTS CAT__14 RENTALS - 4-10 UNIT CAT__15 CONDOS - 2-10 UNIT RESIDENTIAL CAT__16 CONDOS - 2-10 UNIT WITH COMMERCIAL UNIT CAT__17 CONDO COOPS CAT__18 TAX CLASS 3 - UNTILITY PROPERTIES CAT__21 OFFICE BUILDINGS CAT__22 STORE BUILDINGS CAT__23 LOFT BUILDINGS CAT__27 FACTORIES CAT__28 COMMERCIAL CONDOS CAT__29 COMMERCIAL GARAGES CAT__30 WAREHOUSES CAT__31 COMMERCIAL VACANT LAND CAT__32 HOSPITAL AND HEALTH FACILITIES CAT__33 EDUCATIONAL FACILITIES CAT__35 INDOOR PUBLIC AND CULTURAL FACILITIES CAT__36 OUTDOOR RECREATIONAL FACILITIES CAT__37 RELIGIOUS FACILITIES CAT__38 ASYLUMS AND HOMES CAT__39 TRANSPORTATION FACILITIES CAT__41 TAX CLASS 4 - OTHER CAT__42 CONDO CULTURAL/MEDICAL/EDUCATIONAL/ETC CAT__43 CONDO OFFICE BUILDINGS CAT__44 CONDO PARKING CAT__46 CONDO STORE BUILDINGS CAT__47 CONDO NON-BUSINESS STORAGE CAT__48 CONDO TERRACES/GARDENS/CABANAS CAT__49 CONDO WAREHOUSES/FACTORY/INDUS CAT__A1 CAT__A2 CAT__A3 CAT__A4 CAT__A5 CAT__A9 CAT__B1 CAT__B2 CAT__B3 CAT__B9 CAT__C0 CAT__C1 CAT__C2 CAT__C3 CAT__C4 CAT__C5 CAT__C6 CAT__C7 CAT__C8 CAT__C9 CAT__D0 CAT__D1 CAT__D3 CAT__D4 CAT__D5 CAT__D6 CAT__D7 CAT__D8 CAT__D9 CAT__E1 CAT__E2 CAT__E3 CAT__E7 CAT__E9 CAT__F1 CAT__F2 CAT__F4 CAT__F5 CAT__F8 CAT__F9 CAT__G0 CAT__G1 CAT__G2 CAT__G3 CAT__G4 CAT__G5 CAT__G6 CAT__G7 CAT__G8 CAT__G9 CAT__GU CAT__I1 CAT__I4 CAT__I5 CAT__I6 CAT__I7 CAT__I9 CAT__K1 CAT__K2 CAT__K4 CAT__K5 CAT__K9 CAT__L9 CAT__M1 CAT__M2 CAT__M3 CAT__M4 CAT__M9 CAT__N2 CAT__N9 CAT__O1 CAT__O2 CAT__O5 CAT__O6 CAT__O7 
CAT__O8 CAT__O9 CAT__P3 CAT__P5 CAT__P6 CAT__Q9 CAT__R0 CAT__R1 CAT__R2 CAT__R3 CAT__R4 CAT__R5 CAT__R6 CAT__R8 CAT__R9 CAT__RA CAT__RB CAT__RG CAT__RK CAT__RP CAT__RR CAT__RS CAT__RT CAT__RW CAT__S0 CAT__S1 CAT__S2 CAT__S3 CAT__S4 CAT__S5 CAT__S9 CAT__T9 CAT__U7 CAT__V0 CAT__V1 CAT__V2 CAT__V3 CAT__V9 CAT__W1 CAT__W2 CAT__W3 CAT__W8 CAT__W9 CAT__Z0 CAT__Z9 clusters
6291 3 BATH BEACH 8647 15TH AVENUE 11228 40.610414 -74.010528 1 1547 758000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
6292 3 BATH BEACH 55 BAY 10TH STREET 11228 40.609857 -74.009897 1 1933 778000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
6294 3 BATH BEACH 1906 86TH STREET 11214 40.605798 -74.000248 1 1900 1365000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
6295 3 BATH BEACH 50 BAY 23RD STREET 11214 40.604094 -74.000011 1 2417 750000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
6296 3 BATH BEACH 1964 86TH STREET 11214 40.604966 -73.998862 1 1725 1470000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
6297 3 BATH BEACH 1970 86TH STREET 11214 40.604914 -73.998776 1 1725 1790000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
6298 3 BATH BEACH 160 BAY 10TH STREET 11228 40.607955 -74.011906 1 2469 920000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
6299 3 BATH BEACH 1638 BENSON AVENUE 11214 40.607659 -74.007902 1 3625 820000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
6305 3 BATH BEACH 239 BAY8TH STREET 11228 40.607224 -74.015079 1 5810 1850000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
6306 3 BATH BEACH 8804 17TH AVENUE 11214 40.604516 -74.009115 1 3394 1192000 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1

Cluster 2

In [27]:
brooklyn_property_sales[brooklyn_property_sales['clusters'] == 2][:10]
Out[27]:
borough nbhd address zip lat long tax_cls_s land_sqft price CAT__01 ONE FAMILY DWELLINGS CAT__02 TWO FAMILY DWELLINGS CAT__03 THREE FAMILY DWELLINGS CAT__04 TAX CLASS 1 CONDOS CAT__05 TAX CLASS 1 VACANT LAND CAT__06 TAX CLASS 1 - OTHER CAT__07 RENTALS - WALKUP APARTMENTS CAT__08 RENTALS - ELEVATOR APARTMENTS CAT__09 COOPS - WALKUP APARTMENTS CAT__10 COOPS - ELEVATOR APARTMENTS CAT__11 SPECIAL CONDO BILLING LOTS CAT__11A CONDO-RENTALS CAT__12 CONDOS - WALKUP APARTMENTS CAT__13 CONDOS - ELEVATOR APARTMENTS CAT__14 RENTALS - 4-10 UNIT CAT__15 CONDOS - 2-10 UNIT RESIDENTIAL CAT__16 CONDOS - 2-10 UNIT WITH COMMERCIAL UNIT CAT__17 CONDO COOPS CAT__18 TAX CLASS 3 - UNTILITY PROPERTIES CAT__21 OFFICE BUILDINGS CAT__22 STORE BUILDINGS CAT__23 LOFT BUILDINGS CAT__27 FACTORIES CAT__28 COMMERCIAL CONDOS CAT__29 COMMERCIAL GARAGES CAT__30 WAREHOUSES CAT__31 COMMERCIAL VACANT LAND CAT__32 HOSPITAL AND HEALTH FACILITIES CAT__33 EDUCATIONAL FACILITIES CAT__35 INDOOR PUBLIC AND CULTURAL FACILITIES CAT__36 OUTDOOR RECREATIONAL FACILITIES CAT__37 RELIGIOUS FACILITIES CAT__38 ASYLUMS AND HOMES CAT__39 TRANSPORTATION FACILITIES CAT__41 TAX CLASS 4 - OTHER CAT__42 CONDO CULTURAL/MEDICAL/EDUCATIONAL/ETC CAT__43 CONDO OFFICE BUILDINGS CAT__44 CONDO PARKING CAT__46 CONDO STORE BUILDINGS CAT__47 CONDO NON-BUSINESS STORAGE CAT__48 CONDO TERRACES/GARDENS/CABANAS CAT__49 CONDO WAREHOUSES/FACTORY/INDUS CAT__A1 CAT__A2 CAT__A3 CAT__A4 CAT__A5 CAT__A9 CAT__B1 CAT__B2 CAT__B3 CAT__B9 CAT__C0 CAT__C1 CAT__C2 CAT__C3 CAT__C4 CAT__C5 CAT__C6 CAT__C7 CAT__C8 CAT__C9 CAT__D0 CAT__D1 CAT__D3 CAT__D4 CAT__D5 CAT__D6 CAT__D7 CAT__D8 CAT__D9 CAT__E1 CAT__E2 CAT__E3 CAT__E7 CAT__E9 CAT__F1 CAT__F2 CAT__F4 CAT__F5 CAT__F8 CAT__F9 CAT__G0 CAT__G1 CAT__G2 CAT__G3 CAT__G4 CAT__G5 CAT__G6 CAT__G7 CAT__G8 CAT__G9 CAT__GU CAT__I1 CAT__I4 CAT__I5 CAT__I6 CAT__I7 CAT__I9 CAT__K1 CAT__K2 CAT__K4 CAT__K5 CAT__K9 CAT__L9 CAT__M1 CAT__M2 CAT__M3 CAT__M4 CAT__M9 CAT__N2 CAT__N9 CAT__O1 CAT__O2 CAT__O5 CAT__O6 CAT__O7 
CAT__O8 CAT__O9 CAT__P3 CAT__P5 CAT__P6 CAT__Q9 CAT__R0 CAT__R1 CAT__R2 CAT__R3 CAT__R4 CAT__R5 CAT__R6 CAT__R8 CAT__R9 CAT__RA CAT__RB CAT__RG CAT__RK CAT__RP CAT__RR CAT__RS CAT__RT CAT__RW CAT__S0 CAT__S1 CAT__S2 CAT__S3 CAT__S4 CAT__S5 CAT__S9 CAT__T9 CAT__U7 CAT__V0 CAT__V1 CAT__V2 CAT__V3 CAT__V9 CAT__W1 CAT__W2 CAT__W3 CAT__W8 CAT__W9 CAT__Z0 CAT__Z9 clusters
6557 3 BATH BEACH 1934 BATH AVENUE 11214 40.602093 -74.002769 2 2160 1200000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
7457 3 BAY RIDGE 8617 4 AVENUE 11209 40.622327 -74.028533 2 1720 108000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
7458 3 BAY RIDGE 8615 4 AVENUE 11209 40.622373 -74.028515 2 1859 108000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
8916 3 BEDFORD STUYVESANT 912 GATES AVENUE 11221 40.688822 -73.927879 2 1580 1100000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
8925 3 BEDFORD STUYVESANT 529 MARCY AVENUE 11206 40.697131 -73.949548 2 2025 967698 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
8930 3 BEDFORD STUYVESANT 868 DEKALB AVENUE 11221 40.692518 -73.941322 2 2000 1450000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
11041 3 BOROUGH PARK 4023 13 AVENUE 11218 40.639890 -73.987046 2 1615 990000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
11045 3 BOROUGH PARK 5401 8 AVENUE 11220 40.638788 -74.006122 2 1613 4850000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
11048 3 BOROUGH PARK 5917 10TH AVENUE 11219 40.633254 -74.004702 2 2000 1500000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
11051 3 BOROUGH PARK 6005 FORT HAMILTON PARKWA 11219 40.632988 -74.005692 2 1495 2200000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2

Cluster 3

In [28]:
brooklyn_property_sales[brooklyn_property_sales['clusters'] == 3][:10]
Out[28]:
borough nbhd address zip lat long tax_cls_s land_sqft price CAT__01 ONE FAMILY DWELLINGS CAT__02 TWO FAMILY DWELLINGS CAT__03 THREE FAMILY DWELLINGS CAT__04 TAX CLASS 1 CONDOS CAT__05 TAX CLASS 1 VACANT LAND CAT__06 TAX CLASS 1 - OTHER CAT__07 RENTALS - WALKUP APARTMENTS CAT__08 RENTALS - ELEVATOR APARTMENTS CAT__09 COOPS - WALKUP APARTMENTS CAT__10 COOPS - ELEVATOR APARTMENTS CAT__11 SPECIAL CONDO BILLING LOTS CAT__11A CONDO-RENTALS CAT__12 CONDOS - WALKUP APARTMENTS CAT__13 CONDOS - ELEVATOR APARTMENTS CAT__14 RENTALS - 4-10 UNIT CAT__15 CONDOS - 2-10 UNIT RESIDENTIAL CAT__16 CONDOS - 2-10 UNIT WITH COMMERCIAL UNIT CAT__17 CONDO COOPS CAT__18 TAX CLASS 3 - UNTILITY PROPERTIES CAT__21 OFFICE BUILDINGS CAT__22 STORE BUILDINGS CAT__23 LOFT BUILDINGS CAT__27 FACTORIES CAT__28 COMMERCIAL CONDOS CAT__29 COMMERCIAL GARAGES CAT__30 WAREHOUSES CAT__31 COMMERCIAL VACANT LAND CAT__32 HOSPITAL AND HEALTH FACILITIES CAT__33 EDUCATIONAL FACILITIES CAT__35 INDOOR PUBLIC AND CULTURAL FACILITIES CAT__36 OUTDOOR RECREATIONAL FACILITIES CAT__37 RELIGIOUS FACILITIES CAT__38 ASYLUMS AND HOMES CAT__39 TRANSPORTATION FACILITIES CAT__41 TAX CLASS 4 - OTHER CAT__42 CONDO CULTURAL/MEDICAL/EDUCATIONAL/ETC CAT__43 CONDO OFFICE BUILDINGS CAT__44 CONDO PARKING CAT__46 CONDO STORE BUILDINGS CAT__47 CONDO NON-BUSINESS STORAGE CAT__48 CONDO TERRACES/GARDENS/CABANAS CAT__49 CONDO WAREHOUSES/FACTORY/INDUS CAT__A1 CAT__A2 CAT__A3 CAT__A4 CAT__A5 CAT__A9 CAT__B1 CAT__B2 CAT__B3 CAT__B9 CAT__C0 CAT__C1 CAT__C2 CAT__C3 CAT__C4 CAT__C5 CAT__C6 CAT__C7 CAT__C8 CAT__C9 CAT__D0 CAT__D1 CAT__D3 CAT__D4 CAT__D5 CAT__D6 CAT__D7 CAT__D8 CAT__D9 CAT__E1 CAT__E2 CAT__E3 CAT__E7 CAT__E9 CAT__F1 CAT__F2 CAT__F4 CAT__F5 CAT__F8 CAT__F9 CAT__G0 CAT__G1 CAT__G2 CAT__G3 CAT__G4 CAT__G5 CAT__G6 CAT__G7 CAT__G8 CAT__G9 CAT__GU CAT__I1 CAT__I4 CAT__I5 CAT__I6 CAT__I7 CAT__I9 CAT__K1 CAT__K2 CAT__K4 CAT__K5 CAT__K9 CAT__L9 CAT__M1 CAT__M2 CAT__M3 CAT__M4 CAT__M9 CAT__N2 CAT__N9 CAT__O1 CAT__O2 CAT__O5 CAT__O6 CAT__O7 
CAT__O8 CAT__O9 CAT__P3 CAT__P5 CAT__P6 CAT__Q9 CAT__R0 CAT__R1 CAT__R2 CAT__R3 CAT__R4 CAT__R5 CAT__R6 CAT__R8 CAT__R9 CAT__RA CAT__RB CAT__RG CAT__RK CAT__RP CAT__RR CAT__RS CAT__RT CAT__RW CAT__S0 CAT__S1 CAT__S2 CAT__S3 CAT__S4 CAT__S5 CAT__S9 CAT__T9 CAT__U7 CAT__V0 CAT__V1 CAT__V2 CAT__V3 CAT__V9 CAT__W1 CAT__W2 CAT__W3 CAT__W8 CAT__W9 CAT__Z0 CAT__Z9 clusters
9045 3 BEDFORD STUYVESANT 186 FRANKLIN AVENUE 11205 40.693121 -73.957758 2 0 5000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3
9046 3 BEDFORD STUYVESANT 452 LAFAYETTE AVENUE 11205 40.689237 -73.956852 2 0 325840 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3
10135 3 BOERUM HILL 438 ATLANTIC AVENUE 11217 40.686494 -73.983727 2 0 3250000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3
11169 3 BOROUGH PARK 1441 54TH STREET 11219 40.630323 -73.991858 2 0 566637 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3
11172 3 BOROUGH PARK 857 60TH STREET 11220 40.634769 -74.008341 2 0 906620 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3
12042 3 BROOKLYN HEIGHTS 253 HENRY STREET 11201 40.692750 -73.995262 2 0 500000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3
13134 3 BUSHWICK 245 HARMAN STREET 11237 40.698901 -73.919870 2 0 1260000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3
26055 3 PARK SLOPE SOUTH 443 7 AVENUE 11215 40.663130 -73.984706 2 0 299000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3
28361 3 WILLIAMSBURG-EAST 629 GRAND STREET 11211 40.711372 -73.946846 2 0 1313568 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3

Cluster 4

In [29]:
brooklyn_property_sales[brooklyn_property_sales['clusters'] == 4][:10]
Out[29]:
borough nbhd address zip lat long tax_cls_s land_sqft price CAT__01 ONE FAMILY DWELLINGS CAT__02 TWO FAMILY DWELLINGS CAT__03 THREE FAMILY DWELLINGS CAT__04 TAX CLASS 1 CONDOS CAT__05 TAX CLASS 1 VACANT LAND CAT__06 TAX CLASS 1 - OTHER CAT__07 RENTALS - WALKUP APARTMENTS CAT__08 RENTALS - ELEVATOR APARTMENTS CAT__09 COOPS - WALKUP APARTMENTS CAT__10 COOPS - ELEVATOR APARTMENTS CAT__11 SPECIAL CONDO BILLING LOTS CAT__11A CONDO-RENTALS CAT__12 CONDOS - WALKUP APARTMENTS CAT__13 CONDOS - ELEVATOR APARTMENTS CAT__14 RENTALS - 4-10 UNIT CAT__15 CONDOS - 2-10 UNIT RESIDENTIAL CAT__16 CONDOS - 2-10 UNIT WITH COMMERCIAL UNIT CAT__17 CONDO COOPS CAT__18 TAX CLASS 3 - UNTILITY PROPERTIES CAT__21 OFFICE BUILDINGS CAT__22 STORE BUILDINGS CAT__23 LOFT BUILDINGS CAT__27 FACTORIES CAT__28 COMMERCIAL CONDOS CAT__29 COMMERCIAL GARAGES CAT__30 WAREHOUSES CAT__31 COMMERCIAL VACANT LAND CAT__32 HOSPITAL AND HEALTH FACILITIES CAT__33 EDUCATIONAL FACILITIES CAT__35 INDOOR PUBLIC AND CULTURAL FACILITIES CAT__36 OUTDOOR RECREATIONAL FACILITIES CAT__37 RELIGIOUS FACILITIES CAT__38 ASYLUMS AND HOMES CAT__39 TRANSPORTATION FACILITIES CAT__41 TAX CLASS 4 - OTHER CAT__42 CONDO CULTURAL/MEDICAL/EDUCATIONAL/ETC CAT__43 CONDO OFFICE BUILDINGS CAT__44 CONDO PARKING CAT__46 CONDO STORE BUILDINGS CAT__47 CONDO NON-BUSINESS STORAGE CAT__48 CONDO TERRACES/GARDENS/CABANAS CAT__49 CONDO WAREHOUSES/FACTORY/INDUS CAT__A1 CAT__A2 CAT__A3 CAT__A4 CAT__A5 CAT__A9 CAT__B1 CAT__B2 CAT__B3 CAT__B9 CAT__C0 CAT__C1 CAT__C2 CAT__C3 CAT__C4 CAT__C5 CAT__C6 CAT__C7 CAT__C8 CAT__C9 CAT__D0 CAT__D1 CAT__D3 CAT__D4 CAT__D5 CAT__D6 CAT__D7 CAT__D8 CAT__D9 CAT__E1 CAT__E2 CAT__E3 CAT__E7 CAT__E9 CAT__F1 CAT__F2 CAT__F4 CAT__F5 CAT__F8 CAT__F9 CAT__G0 CAT__G1 CAT__G2 CAT__G3 CAT__G4 CAT__G5 CAT__G6 CAT__G7 CAT__G8 CAT__G9 CAT__GU CAT__I1 CAT__I4 CAT__I5 CAT__I6 CAT__I7 CAT__I9 CAT__K1 CAT__K2 CAT__K4 CAT__K5 CAT__K9 CAT__L9 CAT__M1 CAT__M2 CAT__M3 CAT__M4 CAT__M9 CAT__N2 CAT__N9 CAT__O1 CAT__O2 CAT__O5 CAT__O6 CAT__O7 
CAT__O8 CAT__O9 CAT__P3 CAT__P5 CAT__P6 CAT__Q9 CAT__R0 CAT__R1 CAT__R2 CAT__R3 CAT__R4 CAT__R5 CAT__R6 CAT__R8 CAT__R9 CAT__RA CAT__RB CAT__RG CAT__RK CAT__RP CAT__RR CAT__RS CAT__RT CAT__RW CAT__S0 CAT__S1 CAT__S2 CAT__S3 CAT__S4 CAT__S5 CAT__S9 CAT__T9 CAT__U7 CAT__V0 CAT__V1 CAT__V2 CAT__V3 CAT__V9 CAT__W1 CAT__W2 CAT__W3 CAT__W8 CAT__W9 CAT__Z0 CAT__Z9 clusters
6322 3 BATH BEACH 8661 14TH AVENUE 11228 40.611605 -74.012876 1 2320 980000 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4
6323 3 BATH BEACH 70 BAY 7TH STREET 11228 40.611048 -74.012271 1 2320 750000 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4
6324 3 BATH BEACH 37 BAY 7 STREET 11228 40.611504 -74.011770 1 2320 970000 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4
6325 3 BATH BEACH 35 BAY 7TH ST 11228 40.611531 -74.011741 1 2320 855000 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4
6326 3 BATH BEACH 35 BAY 7TH STREET 11228 40.611531 -74.011741 1 2320 855000 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4
6330 3 BATH BEACH 31 BAY 10TH STREET 11228 40.610305 -74.009429 1 2103 840000 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4
6332 3 BATH BEACH 8653 16TH AVENUE 11214 40.609064 -74.008352 1 2058 769000 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4
6335 3 BATH BEACH 18 BAY 14TH STREET 11214 40.608658 -74.006148 1 3033 1120075 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4
6336 3 BATH BEACH 8659 17TH AVENUE 11214 40.607494 -74.005986 1 1933 375000 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4
6337 3 BATH BEACH 8643 17TH AVENUE 11214 40.607725 -74.005744 1 1933 870000 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4

Having analyzed the clusters by value counts of the category columns, here is my interpretation of each of them:

cluster 0: COOPS - ELEVATOR APARTMENTS, CONDOS - ELEVATOR APARTMENTS and RENTALS - WALKUP APARTMENTS, with prices between 550,000 USD and 1,200,000 USD. This is the most populous cluster.

cluster 1: ONE FAMILY DWELLINGS, with prices varying between 450,000 USD and 800,000 USD on average.

cluster 2: RENTALS, with prices varying between 150,000 USD and 500,000 USD on average.

cluster 3: CONDOS, with widely varying prices and only a few instances.

cluster 4: ONE FAMILY DWELLINGS to THREE FAMILY DWELLINGS, with prices between roughly 650,000 USD and 1,200,000 USD.
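
The value-count analysis behind this interpretation is not shown in the notebook. Below is a minimal sketch of how such a per-cluster profile could be produced, assuming the one-hot `CAT__` columns and the `price` and `clusters` columns seen in the outputs above. The helper name `summarize_clusters` and the toy frame are hypothetical and only illustrate the idea.

```python
import pandas as pd


def summarize_clusters(df, cat_prefix="CAT__", top_n=3):
    """Per-cluster size, price quartiles and most frequent categories."""
    cat_cols = [c for c in df.columns if c.startswith(cat_prefix)]
    rows = []
    for cluster, grp in df.groupby("clusters"):
        # Summing a one-hot column gives the number of sales in that category
        counts = grp[cat_cols].sum().sort_values(ascending=False)
        top = [c for c in counts.index[:top_n] if counts[c] > 0]
        rows.append({
            "cluster": cluster,
            "n_sales": len(grp),
            "price_q25": grp["price"].quantile(0.25),
            "price_median": grp["price"].median(),
            "price_q75": grp["price"].quantile(0.75),
            "top_categories": top,
        })
    return pd.DataFrame(rows).set_index("cluster")


# Toy data for illustration only, not the real sales file
toy = pd.DataFrame({
    "price": [758000, 778000, 920000, 980000, 750000],
    "CAT__01 ONE FAMILY DWELLINGS": [1, 1, 1, 0, 0],
    "CAT__02 TWO FAMILY DWELLINGS": [0, 0, 0, 1, 1],
    "clusters": [1, 1, 1, 4, 4],
})
summary = summarize_clusters(toy)
print(summary)
```

On the real data the call would be `summarize_clusters(brooklyn_property_sales)`. Quartiles are used here rather than the mean because a few very expensive sales can distort an average.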



Next, let's count the sales in each cluster per neighborhood by pivoting the data, so that each neighborhood appears exactly once. Then we'll merge the result with brooklyn_data, which contains the neighborhood names and coordinates.

In [30]:
brooklyn_property_pivot = brooklyn_property_sales.pivot_table(index = 'nbhd', columns = 'clusters', values = 'zip', aggfunc = 'count')
brooklyn_property_pivot.index = brooklyn_property_pivot.index.str.title()
brooklyn_property_pivot.head()
Out[30]:
clusters 0 1 2 3 4
nbhd
Bath Beach 55.0 20.0 1.0 NaN 110.0
Bay Ridge 375.0 134.0 2.0 NaN 163.0
Bedford Stuyvesant 431.0 44.0 3.0 2.0 527.0
Bensonhurst 57.0 42.0 NaN NaN 158.0
Bergen Beach 16.0 42.0 NaN NaN 66.0
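
A NaN in the pivot means that no sale from that cluster was recorded in the neighborhood. Below is a minimal sketch of how the counts could be converted to integers and a dominant cluster derived per neighborhood. The frame reproduces the first three rows of the output above, and the column name `dominant_cluster` is an illustrative assumption, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# First three rows of the pivot shown above
pivot = pd.DataFrame(
    {
        0: [55.0, 375.0, 431.0],
        1: [20.0, 134.0, 44.0],
        2: [1.0, 2.0, 3.0],
        3: [np.nan, np.nan, 2.0],
        4: [110.0, 163.0, 527.0],
    },
    index=pd.Index(["Bath Beach", "Bay Ridge", "Bedford Stuyvesant"], name="nbhd"),
)
pivot.columns.name = "clusters"

# Missing combinations are zero sales, so the counts can be stored as integers
counts = pivot.fillna(0).astype(int)

# Cluster label with the highest number of sales in each neighborhood
counts["dominant_cluster"] = counts.idxmax(axis=1)
print(counts)
```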
In [31]:
# Merge with brooklyn_data. The property sales data covers slightly fewer neighborhoods, so the inner join drops some of them from the resulting DataFrame.

brooklyn_data_merged = brooklyn_data.merge(brooklyn_property_pivot, left_on = 'Neighborhood', right_on = 'nbhd')
cols = ['Borough', 'Neighborhood', 'Latitude', 'Longitude', 'sc_0', 'sc_1', 'sc_2', 'sc_3', 'sc_4']
brooklyn_data_merged.columns = cols
brooklyn_data_merged.head()
Out[31]:
Borough Neighborhood Latitude Longitude sc_0 sc_1 sc_2 sc_3 sc_4
0 Brooklyn Bay Ridge 40.625801 -74.030621 375.0 134.0 2.0 NaN 163.0
1 Brooklyn Bensonhurst 40.611009 -73.995180 57.0 42.0 NaN NaN 158.0
2 Brooklyn Sunset Park 40.645103 -74.010316 186.0 17.0 NaN NaN 177.0
3 Brooklyn Greenpoint 40.730201 -73.954241 192.0 10.0 7.0 NaN 64.0
4 Brooklyn Gravesend 40.595260 -73.973471 132.0 74.0 1.0 NaN 179.0
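
Because the merge is an inner join on neighborhood names, any name that differs between the two sources is dropped silently. One way to see what is lost is an outer merge with `indicator=True`. The sketch below uses toy stand-ins for `brooklyn_data` and the pivot; the neighborhood "Toyville", its coordinates and the `sc_*` values are hypothetical.

```python
import pandas as pd

# Toy stand-in for brooklyn_data (the last row is made up)
neighborhoods = pd.DataFrame({
    "Neighborhood": ["Bay Ridge", "Bensonhurst", "Toyville"],
    "Latitude": [40.625801, 40.611009, 40.0],
    "Longitude": [-74.030621, -73.995180, -74.0],
})

# Toy stand-in for the pivot, with the index moved into a column
pivot = pd.DataFrame({
    "nbhd": ["Bay Ridge", "Bensonhurst", "Bath Beach"],
    "sc_0": [375, 57, 55],
    "sc_1": [134, 42, 20],
})

# The _merge column records which side each row came from
check = neighborhoods.merge(
    pivot, left_on="Neighborhood", right_on="nbhd", how="outer", indicator=True
)

left_only = check.loc[check["_merge"] == "left_only", "Neighborhood"].tolist()
right_only = check.loc[check["_merge"] == "right_only", "nbhd"].tolist()
print("Only in the coordinates file:", left_only)
print("Only in the sales pivot:", right_only)
```

Names that appear in only one of the two lists are candidates for a spelling or naming mismatch rather than genuinely missing data.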
  • Having clustered the Brooklyn real estate data and looked at the cluster distribution by neighborhood, we see that our starting point, New York City College of Technology, is located in cluster 0.
  • This corresponds to the mid-to-high price range of real estate sales, which is what this project needs.
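
The notebook does not show how the cluster of the starting point was determined. One possible check, sketched here under stated assumptions, is to count the cluster labels of all sales within a given radius of the college. The helpers `haversine_m` and `clusters_near`, the 1,000 m radius and the toy sales rows are hypothetical; only the college coordinates come from this notebook.

```python
import numpy as np
import pandas as pd


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between points given in degrees."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = np.radians(lat1), np.radians(lat2)
    dphi = phi2 - phi1
    dlmb = np.radians(lon2) - np.radians(lon1)
    a = np.sin(dphi / 2) ** 2 + np.cos(phi1) * np.cos(phi2) * np.sin(dlmb / 2) ** 2
    return 2 * r * np.arcsin(np.sqrt(a))


def clusters_near(sales, lat, lon, radius_m=1000):
    """Count cluster labels of the sales within radius_m of (lat, lon)."""
    dist = haversine_m(lat, lon, sales["lat"].to_numpy(), sales["long"].to_numpy())
    return sales.loc[dist <= radius_m, "clusters"].value_counts()


# Toy sales with made-up coordinates and labels, for illustration only
toy_sales = pd.DataFrame({
    "lat": [40.6950, 40.6960, 40.6940, 40.6104],
    "long": [-73.9870, -73.9850, -73.9880, -74.0105],
    "clusters": [0, 0, 1, 1],
})

# Coordinates of the NYC College of Technology used in this notebook
nearby = clusters_near(toy_sales, 40.695457, -73.9864678851903)
print(nearby)
```

With the real data, `clusters_near(brooklyn_property_sales, ...)` would return the cluster counts around the college, and `nearby.idxmax()` the most common label.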

Foursquare database

In [95]:
CLIENT_ID = '-' # your Foursquare ID
CLIENT_SECRET = '-' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)
Your credentials:
CLIENT_ID: -
CLIENT_SECRET:-
In [33]:
import pandas as pd
import requests


def getNearbyVenues(latitude, longitude, radius=2000, LIMIT=300):
    # Foursquare category ID used to restrict the search to restaurants
    category = '4d4b7105d754a06374d81259'

    # create the API request URL from the function arguments
    # (the original version read the coordinates from global variables)
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&categoryId={}&radius={}&limit={}'.format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        latitude,
        longitude,
        category,
        radius,
        LIMIT)

    # make the GET request
    results = requests.get(url).json()['response']['groups'][0]['items']

    # keep only the relevant information for each nearby venue
    venues = [(
        v['venue']['id'],
        v['venue']['name'],
        v['venue']['categories'][0]['name'],
        v['venue']['location']['lat'],
        v['venue']['location']['lng'],
        v['venue']['location']['distance']
    ) for v in results]

    nearby_venues = pd.DataFrame(venues, columns=[
        'ID',
        'Venue',
        'Venue Category',
        'Venue Latitude',
        'Venue Longitude',
        'Distance from the NYC College of Technology',
    ])

    return nearby_venues
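
The function above assumes that the request succeeds and that every venue carries at least one category. A more defensive parsing step could look like the sketch below. `parse_venue_items` is a hypothetical helper, and the sample items only mimic the fields that the function reads; the second item is made up to show the missing-category case.

```python
import pandas as pd

COLUMNS = [
    'ID',
    'Venue',
    'Venue Category',
    'Venue Latitude',
    'Venue Longitude',
    'Distance from the NYC College of Technology',
]


def parse_venue_items(items):
    """Turn the 'items' list of an explore response into a DataFrame."""
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        # Fall back to an empty category when the list is missing or empty
        categories = venue.get('categories') or [{}]
        rows.append((
            venue.get('id'),
            venue.get('name'),
            categories[0].get('name'),
            location.get('lat'),
            location.get('lng'),
            location.get('distance'),
        ))
    return pd.DataFrame(rows, columns=COLUMNS)


sample_items = [
    {'venue': {'id': '49f9dd95f964a5208d6d1fe3', 'name': 'Wild Ginger',
               'categories': [{'name': 'Vegetarian / Vegan Restaurant'}],
               'location': {'lat': 40.687898, 'lng': -73.989730, 'distance': 885}}},
    # Made-up venue without a category
    {'venue': {'id': 'toy-id', 'name': 'No Category Place', 'categories': [],
               'location': {'lat': 40.69, 'lng': -73.98, 'distance': 500}}},
]
venues = parse_venue_items(sample_items)
print(venues)
```

In the same spirit, the request itself could be sent with a `timeout` argument and followed by `response.raise_for_status()`, so that a failed call raises an HTTP error instead of a `KeyError` on the missing `response` key.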
In [34]:
# coordinates of the NYC College of Technology 

neighborhood_latitude = 40.695457
neighborhood_longitude =  -73.9864678851903

nb = getNearbyVenues(neighborhood_latitude, neighborhood_longitude)
nb.set_index('ID')
Out[34]:
Venue Venue Category Venue Latitude Venue Longitude Distance from the NYC College of Technology
ID
594277b586f4cc0f251fc389 DeKalb Market Hall Food Court 40.691250 -73.982579 571
59da9590e1f0aa52976b8f35 Han Dynasty Chinese Restaurant 40.691334 -73.982456 570
5a87210eda5ede7ae86f82c6 Korilla BBQ Asian Restaurant 40.693803 -73.985843 191
53d1a481498e1fac7c37f675 Pollo d'Oro Peruvian Restaurant 40.694823 -73.983373 270
59a3040fda708024d6926771 Brooklyn Bridge Bistro And Winebar Bistro 40.696272 -73.988343 182
4b6b6ad0f964a52066072ce3 Sushi Gallery Sushi Restaurant 40.697595 -73.993236 618
53b36289498e528b6cad6624 Forno Rosso Pizza Place 40.694437 -73.983442 279
5946d138e2da1964625ca9b2 Daigo Hand Roll Bar Japanese Restaurant 40.691259 -73.982603 569
594416bca4ba7c683f5492df A Taste Of Katz's Sandwich Place 40.691378 -73.982426 567
50478b43e4b05a8c89d8198c Sophies Cuban Cuisine Cuban Restaurant 40.690602 -73.987700 550
479ccb47f964a5206b4d1fe3 Iron Chef House Japanese Restaurant 40.697406 -73.992560 558
5949c93c3ba767778ec8fad9 Eight Turn Crêpe Creperie 40.691186 -73.982831 565
52b204c0498efe3c3be32d91 Burrow Bakery 40.702579 -73.986702 793
5c56083dc58ed7002c3255d0 Charm Kao Thai Restaurant 40.689445 -73.986585 669
5c3ebf15c0f163002ce6db6f CAVA Mediterranean Restaurant 40.692461 -73.988529 376
4ac00013f964a520689320e3 Queen Italian Restaurant 40.691319 -73.991647 635
55664aac498e10d4e5769e5e Damascus Bread & Pastry Shop Bakery 40.690047 -73.993054 819
583c7fa58cfe547612b47b93 Westville DUMBO American Restaurant 40.702021 -73.989596 776
54b4026b498e73880a40a8d7 Grand Army Seafood Restaurant 40.688329 -73.986612 793
4e793558aeb79f7dab5c7f6f Sottocasa Pizzeria Pizza Place 40.688534 -73.988947 798
5012043564a4944f5c47738a Dellarocco's Pizza Place 40.694992 -73.995924 799
4dc5a9c8e4cd169dc6532c66 Shelsky's of Brooklyn Bagel Shop 40.689486 -73.992518 838
4dcc0c3052b17cba4fb62352 Five Guys Burger Joint 40.693676 -73.986087 200
530931a5498e4079544a5f13 French Louie French Restaurant 40.688056 -73.988159 836
3fd66200f964a520eae81ee3 Henry's End New American Restaurant 40.699610 -73.991983 656
550c4a17498e1d6612e9dffc Bread & Spread Sandwich Place 40.702541 -73.987082 790
3fd66200f964a520ece81ee3 Noodle Pudding Italian Restaurant 40.699715 -73.991769 651
43eb685af964a520382f1fe3 Yemen Cafe Middle Eastern Restaurant 40.690026 -73.993579 851
51b0ceff2d4e2ee801fd4ab2 Foragers Deli / Bodega 40.702468 -73.988569 800
5942d5c94c9be62191b036fa sweetgreen Salad Place 40.702914 -73.989830 877
49d3773df964a520fb5b1fe3 Vinegar Hill House American Restaurant 40.702686 -73.981260 916
4a635ce2f964a520dbc41fe3 Lassen & Hennigs Deli / Bodega 40.694970 -73.994857 710
59456df7a4ba7c683f19f0e7 Café D'Avignon Café 40.691250 -73.982786 562
4b5e4c9df964a520648829e3 Mile End Delicatessen Sandwich Place 40.687549 -73.987018 881
4a748cd9f964a52092de1fe3 Fast and Fresh Burrito Deli Burrito Place 40.687983 -73.986935 832
4cbf097828d176b0a7c6226e Colonie American Restaurant 40.690733 -73.995963 958
49e63b62f964a52027641fe3 Ki Sushi Sushi Restaurant 40.687574 -73.989936 925
49f9dd95f964a5208d6d1fe3 Wild Ginger Vegetarian / Vegan Restaurant 40.687898 -73.989730 885
49f3261bf964a520696a1fe3 Clark's Restaurant Diner 40.697533 -73.993044 601
4e1e23631838bed7eb6a8792 Bien Cuit Bakery 40.687635 -73.989864 916
4f3265a119836c91c7d3c6f7 Hadramout Restaurant Restaurant 40.689818 -73.993877 886
5c17e5658c35dc002cdc77e5 Chicks Isan Thai Restaurant 40.690758 -73.983387 584
5a0740a3e1f22816d11723d5 Lillo Italian Restaurant 40.690200 -73.996540 1032
585d72449f25836f2b2b7a1b Xifu Food Chinese Restaurant 40.688027 -73.982088 905
50ca6337e4b04e1f3135689c Juliana's Pizza Pizza Place 40.702769 -73.993616 1013
49e13617f964a520a9611fe3 One Girl Cookies Bakery 40.687404 -73.990287 952
48a41073f964a52091511fe3 Hibino Japanese Restaurant 40.690076 -73.996497 1037
5b7c4cb3c58ed7002c1fd9bd Butler Bakeshop Café 40.703295 -73.992526 1011
53cfc343498e6ad2a8cc30f3 Pio Bagel Bagel Shop 40.692159 -73.986368 367
527d8cfc11d2a61a7c663024 Luzzo's BK Pizza Place 40.690555 -73.995333 926
4be30f9dd27a20a1cd1f915b River Deli Italian Restaurant 40.693713 -73.998435 1028
43164480f964a52065271fe3 Bedouin Tent Middle Eastern Restaurant 40.686936 -73.984469 963
4471bf9af964a5209c331fe3 Jack the Horse Tavern American Restaurant 40.699940 -73.993639 784
5945bcb842d8c24270976841 Likkle More Jerk Caribbean Restaurant 40.690743 -73.983388 585
4a9c07d1f964a520ca3520e3 Heights Falafel Falafel Restaurant 40.698446 -73.992508 608
57d74b38498e41fcd6c3ee92 The Gumbo Bros Cajun / Creole Restaurant 40.689526 -73.991730 795
593c0d2262420b7feccc3048 Cecconi's Italian Restaurant 40.703893 -73.991609 1034
4f69f2b76d86f87117bb13ab Gran Eléctrica Mexican Restaurant 40.702570 -73.993096 969
4d9f5a9efc4f721e7e5a9d5f Rucola Italian Restaurant 40.685659 -73.985769 1092
538d22cb498e86974754d6da Shake Shack Burger Joint 40.703043 -73.994276 1071
5b2932a0f5e9d70039787cf2 Los Tacos Al Pastor Taco Place 40.702436 -73.987539 782
580ba89038fa564f8dec12fb Doner Kebab NYC Kebab Restaurant 40.692116 -73.987493 381
5ca3681932b61d0039533330 Pret A Manger French Restaurant 40.693555 -73.985378 230
5330e34e498e36f70456a24b Saketumi Asian Bistro Asian Restaurant 40.694910 -73.994578 687
4ab6d694f964a520407920e3 Henry Public Gastropub 40.690413 -73.996412 1009
4a674aa8f964a5201fc91fe3 La Bagel Delight Bagel Shop 40.691125 -73.991668 652
4f6e5ea9e4b086107787908b La Vara Spanish Restaurant 40.687851 -73.995582 1144
43504680f964a520b0281fe3 Almondine Bakery Bakery 40.703328 -73.991253 964
5b7f53abe727c40024124d8c Oh! Dumplings Dumpling Restaurant 40.687722 -73.992943 1019
595183c718d43b1841112628 Dulcinea Churros & Co. Bakery 40.691364 -73.982421 569
51a3ab412fc69e7654bc731f Luke's Lobster Seafood Restaurant 40.703441 -73.994091 1097
49e28bbcf964a5203a621fe3 Court Street Bagels Bagel Shop 40.688026 -73.993060 996
4ae10c7af964a520db8421e3 My Little Pizzeria Pizza Place 40.690236 -73.992334 763
3fd66200f964a520efe81ee3 The River Café American Restaurant 40.703754 -73.994834 1162
59e7f722829b0c09cfce96d2 Black Forest Brooklyn German Restaurant 40.685613 -73.991097 1163
3fd66200f964a52058f11ee3 Bar Tabac French Restaurant 40.687369 -73.990106 951
51782c87498e1803b70a775f DUMBO Food Truck Lot Food Truck 40.703126 -73.986851 854
56a122d6498eb97a32f045d4 Maison Kayser Bakery 40.692166 -73.991092 535
4db8989a8154ce84dc168e1c two8two Bar & Burger Burger Joint 40.688513 -73.989743 820
572be80d498e66d83ce6f537 Beasts & Bottles New American Restaurant 40.690489 -73.995108 915
4f7f8b86e4b088077df30175 Chez Moi French Restaurant 40.690654 -73.995777 950
58a202a25490d30f87553a08 Rice & Miso Japanese Restaurant 40.684633 -73.983768 1226
4f21bb85e5e872143c0ca04b One Girl Cookies Bakery 40.703317 -73.990694 944
59399dcd12f0a93a5c392d6a Kotti Berliner Döner Kebab Restaurant 40.690662 -73.983533 588
4a5b37fff964a520edba1fe3 La Bagel Delight Bagel Shop 40.702417 -73.988597 795
4a2aebabf964a5206e961fe3 Farmer in the Deli Deli / Bodega 40.693276 -73.971837 1258
51dd7c69498ee00b70fa54b8 Atrium DUMBO American Restaurant 40.703410 -73.990464 947
4112ca00f964a520ed0b1fe3 Joya Thai Restaurant 40.686708 -73.993718 1150
59456723ad910e218eb07f49 Arepa Lady South American Restaurant 40.691182 -73.982785 568
4ab11a2af964a5200b6820e3 Los Papis Latin American Restaurant 40.702087 -73.985044 747
4bafd791f964a5203c253ce3 Mitoushi Sushi Japanese Restaurant 40.690182 -73.993983 864
5250553911d262bb0c732ee8 Cafe Paulette French Restaurant 40.689743 -73.975798 1102
5914d24b3b83070821fcdd87 Miss Ada Israeli Restaurant 40.689560 -73.972351 1360
5739c66c498e7ef6085cec4f Karasu Japanese Restaurant 40.689577 -73.973290 1290
59dd280192e7a97276e4d52f Celestine Mediterranean Restaurant 40.704537 -73.988026 1019
5529a381498e1232f524541f Numero 28 Pizza & Cucina Italian Restaurant 40.686871 -73.990721 1020
49ba5f96f964a52060531fe3 Building on Bond American Restaurant 40.686502 -73.985150 1003
4a9ff488f964a520b73d20e3 Tutt Cafe Middle Eastern Restaurant 40.700285 -73.993422 795
5995a1ec08815835e6db45aa Andrew’s Classic Brooklyn Bagels Bagel Shop 40.690601 -73.983598 592
51fa84d22fc60e9c2e6f875b Fornino Pizza Place 40.692980 -74.001697 1314
In [35]:
nb.shape
Out[35]:
(100, 6)
In [36]:
nb.groupby('Venue Category')['Venue'].count().sort_values(ascending = False)
Out[36]:
Venue Category
Bakery                           8
American Restaurant              7
Pizza Place                      7
Italian Restaurant               7
Bagel Shop                       6
Japanese Restaurant              6
French Restaurant                5
Thai Restaurant                  3
Burger Joint                     3
Middle Eastern Restaurant        3
Sandwich Place                   3
Deli / Bodega                    3
New American Restaurant          2
Mediterranean Restaurant         2
Café                             2
Seafood Restaurant               2
Sushi Restaurant                 2
Chinese Restaurant               2
Asian Restaurant                 2
Restaurant                       2
Diner                            1
Burrito Place                    1
Cajun / Creole Restaurant        1
Bistro                           1
Caribbean Restaurant             1
Creperie                         1
Cuban Restaurant                 1
Vegetarian / Vegan Restaurant    1
Dumpling Restaurant              1
Falafel Restaurant               1
Food Court                       1
Food Truck                       1
Gastropub                        1
Israeli Restaurant               1
Kebab Restaurant                 1
Latin American Restaurant        1
Mexican Restaurant               1
Peruvian Restaurant              1
Salad Place                      1
South American Restaurant        1
Spanish Restaurant               1
Taco Place                       1
German Restaurant                1
Name: Venue, dtype: int64

Now let's get more details about these places, including the rating and the price tier.

In [37]:
details = []

for venue_id in nb.ID:
    # request the venue details from the Foursquare API
    url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(
        venue_id,
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION
        )
    results = requests.get(url).json()

    try:
        venue = results['response']['venue']
        details.append([
            venue['id'] or 0,
            venue['stats']['tipCount'] or 0,
            venue['price']['tier'] if venue['price'] else 0,
            venue['likes']['count'] or 0,
            venue['rating'] or 0])
    except KeyError:
        # skip venues whose response lacks any of these fields
        continue
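The loop above drops a venue whenever any field is missing, which is likely why the details table ends up shorter than `nb`. A minimal sketch of an alternative that keeps every venue and fills gaps with 0 is shown here. The helper names `dig` and `venue_row` are illustrative assumptions, not part of the Foursquare API, and the sample payload is made up for the demonstration.

```python
def dig(data, *keys, default=0):
    """Walk nested dicts; return `default` when a key is missing or the value is falsy."""
    for key in keys:
        if not isinstance(data, dict) or key not in data:
            return default
        data = data[key]
    return data if data else default


def venue_row(payload):
    """Build one details row from an already parsed venue-details payload."""
    venue = dig(payload, 'response', 'venue', default={})
    return [
        dig(venue, 'id'),
        dig(venue, 'stats', 'tipCount'),
        dig(venue, 'price', 'tier'),
        dig(venue, 'likes', 'count'),
        dig(venue, 'rating'),
    ]


# Hypothetical payload with no 'price' and no 'rating' keys
sample = {'response': {'venue': {'id': 'abc',
                                 'stats': {'tipCount': 4},
                                 'likes': {'count': 24}}}}
print(venue_row(sample))
```

With this approach a price tier of 0 would mean "unknown" and would need to be handled before clustering.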

    
    
In [38]:
nearby_venues = pd.DataFrame(details)
nearby_venues.columns = [ 
                  'ID',
                  'Tips_count', 
                  'Price tier', 
                  'Likes_count',
                  'Rating',
                  ]
nearby_venues.set_index('ID')
Out[38]:
Tips_count Price tier Likes_count Rating
ID
594277b586f4cc0f251fc389 96 2 731 9.3
59da9590e1f0aa52976b8f35 30 1 145 8.8
5a87210eda5ede7ae86f82c6 4 2 24 8.0
53d1a481498e1fac7c37f675 15 2 37 8.1
4b6b6ad0f964a52066072ce3 15 2 38 8.8
53b36289498e528b6cad6624 39 1 165 8.1
5946d138e2da1964625ca9b2 10 2 46 8.6
594416bca4ba7c683f5492df 3 1 36 8.5
50478b43e4b05a8c89d8198c 27 1 96 8.4
479ccb47f964a5206b4d1fe3 66 2 140 8.4
52b204c0498efe3c3be32d91 32 1 140 9.2
5c56083dc58ed7002c3255d0 3 2 10 8.7
5c3ebf15c0f163002ce6db6f 2 2 13 8.1
4ac00013f964a520689320e3 45 2 92 8.5
55664aac498e10d4e5769e5e 19 1 82 9.3
583c7fa58cfe547612b47b93 22 2 207 9.0
54b4026b498e73880a40a8d7 80 3 420 9.0
4e793558aeb79f7dab5c7f6f 114 2 416 9.0
5012043564a4944f5c47738a 56 2 172 9.0
4dc5a9c8e4cd169dc6532c66 121 3 306 9.2
4dcc0c3052b17cba4fb62352 38 1 114 7.8
530931a5498e4079544a5f13 112 3 550 9.0
3fd66200f964a520eae81ee3 31 2 87 8.4
550c4a17498e1d6612e9dffc 12 2 81 8.8
3fd66200f964a520ece81ee3 87 2 219 8.4
43eb685af964a520382f1fe3 95 2 333 9.0
51b0ceff2d4e2ee801fd4ab2 97 1 300 8.8
5942d5c94c9be62191b036fa 12 1 89 9.1
49d3773df964a520fb5b1fe3 254 3 691 9.1
4a635ce2f964a520dbc41fe3 48 1 100 8.4
59456df7a4ba7c683f19f0e7 9 1 31 8.1
4b5e4c9df964a520648829e3 310 2 737 8.8
4a748cd9f964a52092de1fe3 63 1 133 8.7
4cbf097828d176b0a7c6226e 233 3 732 9.2
49e63b62f964a52027641fe3 88 3 193 8.9
49f9dd95f964a5208d6d1fe3 60 2 195 8.7
49f3261bf964a520696a1fe3 92 2 211 8.1
4e1e23631838bed7eb6a8792 173 1 532 8.8
4f3265a119836c91c7d3c6f7 6 2 11 8.7
5c17e5658c35dc002cdc77e5 6 2 15 8.1
5a0740a3e1f22816d11723d5 7 2 34 9.2
585d72449f25836f2b2b7a1b 9 1 23 8.6
50ca6337e4b04e1f3135689c 301 2 979 9.1
49e13617f964a520a9611fe3 77 2 152 8.8
48a41073f964a52091511fe3 152 2 460 9.2
5b7c4cb3c58ed7002c1fd9bd 11 1 52 9.0
53cfc343498e6ad2a8cc30f3 22 1 73 7.7
527d8cfc11d2a61a7c663024 44 2 163 8.6
4be30f9dd27a20a1cd1f915b 95 2 292 9.0
43164480f964a52065271fe3 122 1 343 8.7
4471bf9af964a5209c331fe3 88 3 209 8.3
5945bcb842d8c24270976841 4 2 18 8.0
4a9c07d1f964a520ca3520e3 38 1 83 8.0
57d74b38498e41fcd6c3ee92 26 2 85 8.3
593c0d2262420b7feccc3048 65 2 302 9.0
4f69f2b76d86f87117bb13ab 195 3 629 8.7
4d9f5a9efc4f721e7e5a9d5f 227 3 844 9.3
538d22cb498e86974754d6da 168 1 1153 9.1
5b2932a0f5e9d70039787cf2 4 1 37 8.2
5ca3681932b61d0039533330 1 3 1 7.5
5330e34e498e36f70456a24b 6 2 29 8.0
4ab6d694f964a520407920e3 186 3 500 8.6
4a674aa8f964a5201fc91fe3 41 1 84 8.0
4f6e5ea9e4b086107787908b 137 3 472 9.1
43504680f964a520b0281fe3 144 1 262 8.5
5b7f53abe727c40024124d8c 6 1 35 8.6
595183c718d43b1841112628 6 1 17 7.8
51a3ab412fc69e7654bc731f 51 2 192 8.8
49e28bbcf964a5203a621fe3 57 1 165 8.5
4ae10c7af964a520db8421e3 49 1 96 8.1
3fd66200f964a520efe81ee3 158 4 412 8.9
59e7f722829b0c09cfce96d2 10 2 72 8.9
3fd66200f964a52058f11ee3 209 2 438 8.3
51782c87498e1803b70a775f 11 2 83 8.2
56a122d6498eb97a32f045d4 21 1 88 7.7
4db8989a8154ce84dc168e1c 165 2 329 8.1
572be80d498e66d83ce6f537 18 3 97 8.2
4f7f8b86e4b088077df30175 71 2 186 8.2
58a202a25490d30f87553a08 15 2 79 8.9
4f21bb85e5e872143c0ca04b 169 1 503 8.2
59399dcd12f0a93a5c392d6a 7 2 27 7.7
4a5b37fff964a520edba1fe3 66 1 173 7.9
4a2aebabf964a5206e961fe3 24 1 63 8.9
51dd7c69498ee00b70fa54b8 98 3 330 8.1
4112ca00f964a520ed0b1fe3 158 2 309 8.5
59456723ad910e218eb07f49 4 2 29 7.7
4ab11a2af964a5200b6820e3 22 1 44 7.8
4bafd791f964a5203c253ce3 30 2 47 8.0
5250553911d262bb0c732ee8 50 2 201 8.4
5914d24b3b83070821fcdd87 44 2 227 9.3
5739c66c498e7ef6085cec4f 32 3 179 8.9
59dd280192e7a97276e4d52f 22 2 104 8.2
5529a381498e1232f524541f 24 2 79 8.2
49ba5f96f964a52060531fe3 219 2 563 8.2
4a9ff488f964a520b73d20e3 33 1 69 7.9
5995a1ec08815835e6db45aa 3 1 15 7.6
51fa84d22fc60e9c2e6f875b 79 2 340 8.8

Price tier

According to the Foursquare description, the valid price tiers are currently 1, 2, 3 and 4, with 1 being the least expensive and 4 the most expensive. For food venues in the United States, tier 1 means an entree under 10 USD, tier 2 is 10-20 USD, tier 3 is 20-30 USD, and tier 4 is over 30 USD.
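To make the tiers easier to read in tables, the description above can be turned into a small lookup. This is only a sketch: the names `PRICE_TIER_LABELS` and `price_tier_label` are assumptions introduced here, and any value outside 1 to 4 is reported as unknown.

```python
# Labels follow the tier description given in the text above
PRICE_TIER_LABELS = {
    1: 'under 10 USD',
    2: '10-20 USD',
    3: '20-30 USD',
    4: 'over 30 USD',
}


def price_tier_label(tier):
    """Return a readable entree price range for a Foursquare price tier."""
    return PRICE_TIER_LABELS.get(tier, 'unknown')


print(price_tier_label(2))
print(price_tier_label(0))
```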

In [39]:
nb_merged = nb.merge(nearby_venues, right_index = True, left_index = True).drop('ID_y',axis = 1)
nb_merged
Out[39]:
ID_x Venue Venue Category Venue Latitude Venue Longitude Distance from the NYC College of Technology Tips_count Price tier Likes_count Rating
0 594277b586f4cc0f251fc389 DeKalb Market Hall Food Court 40.691250 -73.982579 571 96 2 731 9.3
1 59da9590e1f0aa52976b8f35 Han Dynasty Chinese Restaurant 40.691334 -73.982456 570 30 1 145 8.8
2 5a87210eda5ede7ae86f82c6 Korilla BBQ Asian Restaurant 40.693803 -73.985843 191 4 2 24 8.0
3 53d1a481498e1fac7c37f675 Pollo d'Oro Peruvian Restaurant 40.694823 -73.983373 270 15 2 37 8.1
4 59a3040fda708024d6926771 Brooklyn Bridge Bistro And Winebar Bistro 40.696272 -73.988343 182 15 2 38 8.8
5 4b6b6ad0f964a52066072ce3 Sushi Gallery Sushi Restaurant 40.697595 -73.993236 618 39 1 165 8.1
6 53b36289498e528b6cad6624 Forno Rosso Pizza Place 40.694437 -73.983442 279 10 2 46 8.6
7 5946d138e2da1964625ca9b2 Daigo Hand Roll Bar Japanese Restaurant 40.691259 -73.982603 569 3 1 36 8.5
8 594416bca4ba7c683f5492df A Taste Of Katz's Sandwich Place 40.691378 -73.982426 567 27 1 96 8.4
9 50478b43e4b05a8c89d8198c Sophies Cuban Cuisine Cuban Restaurant 40.690602 -73.987700 550 66 2 140 8.4
10 479ccb47f964a5206b4d1fe3 Iron Chef House Japanese Restaurant 40.697406 -73.992560 558 32 1 140 9.2
11 5949c93c3ba767778ec8fad9 Eight Turn Crêpe Creperie 40.691186 -73.982831 565 3 2 10 8.7
12 52b204c0498efe3c3be32d91 Burrow Bakery 40.702579 -73.986702 793 2 2 13 8.1
13 5c56083dc58ed7002c3255d0 Charm Kao Thai Restaurant 40.689445 -73.986585 669 45 2 92 8.5
14 5c3ebf15c0f163002ce6db6f CAVA Mediterranean Restaurant 40.692461 -73.988529 376 19 1 82 9.3
15 4ac00013f964a520689320e3 Queen Italian Restaurant 40.691319 -73.991647 635 22 2 207 9.0
16 55664aac498e10d4e5769e5e Damascus Bread & Pastry Shop Bakery 40.690047 -73.993054 819 80 3 420 9.0
17 583c7fa58cfe547612b47b93 Westville DUMBO American Restaurant 40.702021 -73.989596 776 114 2 416 9.0
18 54b4026b498e73880a40a8d7 Grand Army Seafood Restaurant 40.688329 -73.986612 793 56 2 172 9.0
19 4e793558aeb79f7dab5c7f6f Sottocasa Pizzeria Pizza Place 40.688534 -73.988947 798 121 3 306 9.2
20 5012043564a4944f5c47738a Dellarocco's Pizza Place 40.694992 -73.995924 799 38 1 114 7.8
21 4dc5a9c8e4cd169dc6532c66 Shelsky's of Brooklyn Bagel Shop 40.689486 -73.992518 838 112 3 550 9.0
22 4dcc0c3052b17cba4fb62352 Five Guys Burger Joint 40.693676 -73.986087 200 31 2 87 8.4
23 530931a5498e4079544a5f13 French Louie French Restaurant 40.688056 -73.988159 836 12 2 81 8.8
24 3fd66200f964a520eae81ee3 Henry's End New American Restaurant 40.699610 -73.991983 656 87 2 219 8.4
25 550c4a17498e1d6612e9dffc Bread & Spread Sandwich Place 40.702541 -73.987082 790 95 2 333 9.0
26 3fd66200f964a520ece81ee3 Noodle Pudding Italian Restaurant 40.699715 -73.991769 651 97 1 300 8.8
27 43eb685af964a520382f1fe3 Yemen Cafe Middle Eastern Restaurant 40.690026 -73.993579 851 12 1 89 9.1
28 51b0ceff2d4e2ee801fd4ab2 Foragers Deli / Bodega 40.702468 -73.988569 800 254 3 691 9.1
29 5942d5c94c9be62191b036fa sweetgreen Salad Place 40.702914 -73.989830 877 48 1 100 8.4
30 49d3773df964a520fb5b1fe3 Vinegar Hill House American Restaurant 40.702686 -73.981260 916 9 1 31 8.1
31 4a635ce2f964a520dbc41fe3 Lassen & Hennigs Deli / Bodega 40.694970 -73.994857 710 310 2 737 8.8
32 59456df7a4ba7c683f19f0e7 Café D'Avignon Café 40.691250 -73.982786 562 63 1 133 8.7
33 4b5e4c9df964a520648829e3 Mile End Delicatessen Sandwich Place 40.687549 -73.987018 881 233 3 732 9.2
34 4a748cd9f964a52092de1fe3 Fast and Fresh Burrito Deli Burrito Place 40.687983 -73.986935 832 88 3 193 8.9
35 4cbf097828d176b0a7c6226e Colonie American Restaurant 40.690733 -73.995963 958 60 2 195 8.7
36 49e63b62f964a52027641fe3 Ki Sushi Sushi Restaurant 40.687574 -73.989936 925 92 2 211 8.1
37 49f9dd95f964a5208d6d1fe3 Wild Ginger Vegetarian / Vegan Restaurant 40.687898 -73.989730 885 173 1 532 8.8
38 49f3261bf964a520696a1fe3 Clark's Restaurant Diner 40.697533 -73.993044 601 6 2 11 8.7
39 4e1e23631838bed7eb6a8792 Bien Cuit Bakery 40.687635 -73.989864 916 6 2 15 8.1
40 4f3265a119836c91c7d3c6f7 Hadramout Restaurant Restaurant 40.689818 -73.993877 886 7 2 34 9.2
41 5c17e5658c35dc002cdc77e5 Chicks Isan Thai Restaurant 40.690758 -73.983387 584 9 1 23 8.6
42 5a0740a3e1f22816d11723d5 Lillo Italian Restaurant 40.690200 -73.996540 1032 301 2 979 9.1
43 585d72449f25836f2b2b7a1b Xifu Food Chinese Restaurant 40.688027 -73.982088 905 77 2 152 8.8
44 50ca6337e4b04e1f3135689c Juliana's Pizza Pizza Place 40.702769 -73.993616 1013 152 2 460 9.2
45 49e13617f964a520a9611fe3 One Girl Cookies Bakery 40.687404 -73.990287 952 11 1 52 9.0
46 48a41073f964a52091511fe3 Hibino Japanese Restaurant 40.690076 -73.996497 1037 22 1 73 7.7
47 5b7c4cb3c58ed7002c1fd9bd Butler Bakeshop Café 40.703295 -73.992526 1011 44 2 163 8.6
48 53cfc343498e6ad2a8cc30f3 Pio Bagel Bagel Shop 40.692159 -73.986368 367 95 2 292 9.0
49 527d8cfc11d2a61a7c663024 Luzzo's BK Pizza Place 40.690555 -73.995333 926 122 1 343 8.7
50 4be30f9dd27a20a1cd1f915b River Deli Italian Restaurant 40.693713 -73.998435 1028 88 3 209 8.3
51 43164480f964a52065271fe3 Bedouin Tent Middle Eastern Restaurant 40.686936 -73.984469 963 4 2 18 8.0
52 4471bf9af964a5209c331fe3 Jack the Horse Tavern American Restaurant 40.699940 -73.993639 784 38 1 83 8.0
53 5945bcb842d8c24270976841 Likkle More Jerk Caribbean Restaurant 40.690743 -73.983388 585 26 2 85 8.3
54 4a9c07d1f964a520ca3520e3 Heights Falafel Falafel Restaurant 40.698446 -73.992508 608 65 2 302 9.0
55 57d74b38498e41fcd6c3ee92 The Gumbo Bros Cajun / Creole Restaurant 40.689526 -73.991730 795 195 3 629 8.7
56 593c0d2262420b7feccc3048 Cecconi's Italian Restaurant 40.703893 -73.991609 1034 227 3 844 9.3
57 4f69f2b76d86f87117bb13ab Gran Eléctrica Mexican Restaurant 40.702570 -73.993096 969 168 1 1153 9.1
58 4d9f5a9efc4f721e7e5a9d5f Rucola Italian Restaurant 40.685659 -73.985769 1092 4 1 37 8.2
59 538d22cb498e86974754d6da Shake Shack Burger Joint 40.703043 -73.994276 1071 1 3 1 7.5
60 5b2932a0f5e9d70039787cf2 Los Tacos Al Pastor Taco Place 40.702436 -73.987539 782 6 2 29 8.0
61 580ba89038fa564f8dec12fb Doner Kebab NYC Kebab Restaurant 40.692116 -73.987493 381 186 3 500 8.6
62 5ca3681932b61d0039533330 Pret A Manger French Restaurant 40.693555 -73.985378 230 41 1 84 8.0
63 5330e34e498e36f70456a24b Saketumi Asian Bistro Asian Restaurant 40.694910 -73.994578 687 137 3 472 9.1
64 4ab6d694f964a520407920e3 Henry Public Gastropub 40.690413 -73.996412 1009 144 1 262 8.5
65 4a674aa8f964a5201fc91fe3 La Bagel Delight Bagel Shop 40.691125 -73.991668 652 6 1 35 8.6
66 4f6e5ea9e4b086107787908b La Vara Spanish Restaurant 40.687851 -73.995582 1144 6 1 17 7.8
67 43504680f964a520b0281fe3 Almondine Bakery Bakery 40.703328 -73.991253 964 51 2 192 8.8
68 5b7f53abe727c40024124d8c Oh! Dumplings Dumpling Restaurant 40.687722 -73.992943 1019 57 1 165 8.5
69 595183c718d43b1841112628 Dulcinea Churros & Co. Bakery 40.691364 -73.982421 569 49 1 96 8.1
70 51a3ab412fc69e7654bc731f Luke's Lobster Seafood Restaurant 40.703441 -73.994091 1097 158 4 412 8.9
71 49e28bbcf964a5203a621fe3 Court Street Bagels Bagel Shop 40.688026 -73.993060 996 10 2 72 8.9
72 4ae10c7af964a520db8421e3 My Little Pizzeria Pizza Place 40.690236 -73.992334 763 209 2 438 8.3
73 3fd66200f964a520efe81ee3 The River Café American Restaurant 40.703754 -73.994834 1162 11 2 83 8.2
74 59e7f722829b0c09cfce96d2 Black Forest Brooklyn German Restaurant 40.685613 -73.991097 1163 21 1 88 7.7
75 3fd66200f964a52058f11ee3 Bar Tabac French Restaurant 40.687369 -73.990106 951 165 2 329 8.1
76 51782c87498e1803b70a775f DUMBO Food Truck Lot Food Truck 40.703126 -73.986851 854 18 3 97 8.2
77 56a122d6498eb97a32f045d4 Maison Kayser Bakery 40.692166 -73.991092 535 71 2 186 8.2
78 4db8989a8154ce84dc168e1c two8two Bar & Burger Burger Joint 40.688513 -73.989743 820 15 2 79 8.9
79 572be80d498e66d83ce6f537 Beasts & Bottles New American Restaurant 40.690489 -73.995108 915 169 1 503 8.2
80 4f7f8b86e4b088077df30175 Chez Moi French Restaurant 40.690654 -73.995777 950 7 2 27 7.7
81 58a202a25490d30f87553a08 Rice & Miso Japanese Restaurant 40.684633 -73.983768 1226 66 1 173 7.9
82 4f21bb85e5e872143c0ca04b One Girl Cookies Bakery 40.703317 -73.990694 944 24 1 63 8.9
83 59399dcd12f0a93a5c392d6a Kotti Berliner Döner Kebab Restaurant 40.690662 -73.983533 588 98 3 330 8.1
84 4a5b37fff964a520edba1fe3 La Bagel Delight Bagel Shop 40.702417 -73.988597 795 158 2 309 8.5
85 4a2aebabf964a5206e961fe3 Farmer in the Deli Deli / Bodega 40.693276 -73.971837 1258 4 2 29 7.7
86 51dd7c69498ee00b70fa54b8 Atrium DUMBO American Restaurant 40.703410 -73.990464 947 22 1 44 7.8
87 4112ca00f964a520ed0b1fe3 Joya Thai Restaurant 40.686708 -73.993718 1150 30 2 47 8.0
88 59456723ad910e218eb07f49 Arepa Lady South American Restaurant 40.691182 -73.982785 568 50 2 201 8.4
89 4ab11a2af964a5200b6820e3 Los Papis Latin American Restaurant 40.702087 -73.985044 747 44 2 227 9.3
90 4bafd791f964a5203c253ce3 Mitoushi Sushi Japanese Restaurant 40.690182 -73.993983 864 32 3 179 8.9
91 5250553911d262bb0c732ee8 Cafe Paulette French Restaurant 40.689743 -73.975798 1102 22 2 104 8.2
92 5914d24b3b83070821fcdd87 Miss Ada Israeli Restaurant 40.689560 -73.972351 1360 24 2 79 8.2
93 5739c66c498e7ef6085cec4f Karasu Japanese Restaurant 40.689577 -73.973290 1290 219 2 563 8.2
94 59dd280192e7a97276e4d52f Celestine Mediterranean Restaurant 40.704537 -73.988026 1019 33 1 69 7.9
95 5529a381498e1232f524541f Numero 28 Pizza & Cucina Italian Restaurant 40.686871 -73.990721 1020 3 1 15 7.6
96 49ba5f96f964a52060531fe3 Building on Bond American Restaurant 40.686502 -73.985150 1003 79 2 340 8.8
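One thing to watch in the merge above: it pairs rows by positional index, while the details table has fewer rows than `nb` (97 against 100). From row 4 onward, ID_x no longer matches the ID of the details row it was paired with, so the ratings and price tiers may belong to a different venue. A sketch of a merge keyed on the venue ID is shown here with made-up toy frames; the frame and column values are illustrative only.

```python
import pandas as pd

# Toy stand-ins for nb and nearby_venues; venue 'b2' has no details row
venues = pd.DataFrame({
    'ID': ['a1', 'b2', 'c3'],
    'Venue': ['Alpha', 'Beta', 'Gamma'],
})
details = pd.DataFrame({
    'ID': ['a1', 'c3'],
    'Rating': [9.3, 8.1],
})

# Positional merge: 'Beta' silently receives the rating that belongs to 'c3'
by_position = venues.merge(details, left_index=True, right_index=True)

# Key-based merge: each rating stays with its own venue, 'Beta' is dropped
by_id = venues.merge(details, on='ID', how='inner')

print(by_position[['Venue', 'Rating']])
print(by_id[['Venue', 'Rating']])
```

Applied to the notebook, this would correspond to merging `nb` and `nearby_venues` on their shared `ID` column, which would also change the clustering input that follows.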

Let's add these places to the Brooklyn map and then cluster them.

In [40]:
address = 'Brooklyn, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Brooklyn are {}, {}.'.format(latitude, longitude))
The geographical coordinates of Brooklyn are 40.6501038, -73.9495823.
In [41]:
# create map of Brooklyn using latitude and longitude values
map_brooklyn_1 = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(nb['Venue Latitude'], nb['Venue Longitude'], nb['Venue']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_brooklyn_1)  


#add New York City College of Technology to the map
label = folium.Popup('New York City College of Technology', parse_html=True)
folium.CircleMarker(
        [40.695457, -73.9864678851903],
        radius=20,
        popup=label,
        color='darkgreen',
        fill=True,
        fill_color='darkgreen',
        fill_opacity=0.7,
        parse_html=False).add_to(map_brooklyn_1)     
    
map_brooklyn_1
Out[41]:

Okay, that's a good sign, since most of the venues are on the opposite side of the river. Let's cluster them.

In [44]:
venues_cluster = nb_merged.iloc[:,5:]


se = StandardScaler()


X_1 = se.fit_transform(venues_cluster)

# run k-means for k = 1..9 and record the distortion for the elbow plot
distortions = []
K = range(1,10)
for k in K:
    kmeanModel = KMeans(n_clusters=k).fit(X_1)
    # distortion: mean Euclidean distance from each venue to its nearest cluster center
    distortions.append(sum(np.min(cdist(X_1, kmeanModel.cluster_centers_, 'euclidean'), axis=1)) / X_1.shape[0])

# Plot the elbow
plt.plot(K, distortions, 'bx-')
plt.xlabel('k')
plt.ylabel('Distortion')
plt.title('The Elbow Method showing the optimal k')
plt.show()
/usr/local/anaconda3/lib/python3.7/site-packages/sklearn/preprocessing/data.py:645: DataConversionWarning: Data with input dtype int64, float64 were all converted to float64 by StandardScaler.
  return self.partial_fit(X, y)
/usr/local/anaconda3/lib/python3.7/site-packages/sklearn/base.py:464: DataConversionWarning: Data with input dtype int64, float64 were all converted to float64 by StandardScaler.
  return self.fit(X, **fit_params).transform(X)
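To make the two steps in the cell above concrete, here is a small standard-library sketch of what they compute: z-score scaling of a feature column (as far as I know, `StandardScaler` divides by the population standard deviation) and the distortion used for the elbow plot. The function names and the toy points are assumptions for illustration, not code from the notebook.

```python
import math
from statistics import fmean, pstdev


def standardize(column):
    """Z-score one feature column: subtract the mean, divide by the population std."""
    mean, std = fmean(column), pstdev(column)
    return [(value - mean) / std for value in column]


def distortion(points, centers):
    """Average Euclidean distance from each point to its nearest center."""
    nearest = [min(math.dist(p, c) for c in centers) for p in points]
    return sum(nearest) / len(points)


# Two obvious groups of toy points, around x = 0 and x = 10
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
one_center = distortion(points, [(5.0, 1.0)])
two_centers = distortion(points, [(0.0, 1.0), (10.0, 1.0)])
print(one_center, two_centers)
print(standardize([1.0, 2.0, 3.0]))
```

The sharp drop in distortion between one and two centers is the kind of "elbow" the plot is meant to reveal.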
In [51]:
# let's stick with 3 clusters

kclusters = 3

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(X_1)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 
Out[51]:
array([0, 2, 2, 2, 2, 2, 2, 2, 2, 2], dtype=int32)
In [52]:
pd.Series(kmeans.labels_).value_counts()
Out[52]:
2    42
1    29
0    26
dtype: int64
In [53]:
nb_merged['cluster'] = kmeans.labels_
nb_merged.head()
Out[53]:
ID_x Venue Venue Category Venue Latitude Venue Longitude Distance from the NYC College of Technology Tips_count Price tier Likes_count Rating cluster
0 594277b586f4cc0f251fc389 DeKalb Market Hall Food Court 40.691250 -73.982579 571 96 2 731 9.3 0
1 59da9590e1f0aa52976b8f35 Han Dynasty Chinese Restaurant 40.691334 -73.982456 570 30 1 145 8.8 2
2 5a87210eda5ede7ae86f82c6 Korilla BBQ Asian Restaurant 40.693803 -73.985843 191 4 2 24 8.0 2
3 53d1a481498e1fac7c37f675 Pollo d'Oro Peruvian Restaurant 40.694823 -73.983373 270 15 2 37 8.1 2
4 59a3040fda708024d6926771 Brooklyn Bridge Bistro And Winebar Bistro 40.696272 -73.988343 182 15 2 38 8.8 2

Add cluster labels to the map

In [54]:
# create map
map_clusters_1 = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cat, cluster in zip(nb_merged['Venue Latitude'], nb_merged['Venue Longitude'], nb_merged['Venue'], nb_merged['Venue Category'] ,  nb_merged['cluster']):
    label = folium.Popup(str(poi) + "_" + str(cat) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters_1)

#add New York City College of Technology to the map   
label = folium.Popup('New York City College of Technology', parse_html=True)
folium.CircleMarker(
        [40.695457, -73.9864678851903],
        radius=20,
        popup=label,
        color='darkgreen',
        fill=True,
        fill_color='darkgreen',
        fill_opacity=0.7,
        parse_html=False).add_to(map_clusters_1)    
map_clusters_1
Out[54]:
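In the marker loop, `rainbow[cluster-1]` picks the color for cluster 0 through Python's negative indexing (it takes the last color in the list). An explicit label-to-color mapping is one way to make the assignment easier to read in a legend. This is a sketch only; the function name `cluster_colors` and the hex values are arbitrary choices, not taken from the notebook.

```python
def cluster_colors(n_clusters, palette=('#e41a1c', '#377eb8', '#4daf4a', '#984ea3')):
    """Map each cluster label 0..n_clusters-1 to one hex color, cycling if needed."""
    return {label: palette[label % len(palette)] for label in range(n_clusters)}


colors_by_cluster = cluster_colors(3)
print(colors_by_cluster)
```

Inside the loop, the marker color would then be looked up as `colors_by_cluster[cluster]`.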

Quick analysis of the venue clusters

Cluster 0

In [56]:
nb_merged[nb_merged['cluster']  == 0].groupby('Venue Category')['Venue'].count().sort_values(ascending = False)
Out[56]:
Venue Category
Pizza Place                      3
Italian Restaurant               2
Bagel Shop                       2
Deli / Bodega                    2
American Restaurant              2
Sandwich Place                   2
New American Restaurant          1
Mexican Restaurant               1
Kebab Restaurant                 1
Japanese Restaurant              1
Seafood Restaurant               1
French Restaurant                1
Food Court                       1
Restaurant                       1
Cajun / Creole Restaurant        1
Burrito Place                    1
Bakery                           1
Asian Restaurant                 1
Vegetarian / Vegan Restaurant    1
Name: Venue, dtype: int64

From here, we exclude several categories (Pizza Place, Bagel Shop, Kebab Restaurant, Seafood Restaurant, Sandwich Place), since they're not our competitors.

In [77]:
nb_merged.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 97 entries, 0 to 96
Data columns (total 11 columns):
ID_x                                           97 non-null object
Venue                                          97 non-null object
Venue Category                                 97 non-null object
Venue Latitude                                 97 non-null float64
Venue Longitude                                97 non-null float64
Distance from the NYC College of Technology    97 non-null int64
Tips_count                                     97 non-null int64
Price tier                                     97 non-null int64
Likes_count                                    97 non-null int64
Rating                                         97 non-null float64
cluster                                        97 non-null int32
dtypes: float64(3), int32(1), int64(4), object(3)
memory usage: 8.7+ KB
In [79]:
nb_merged[(nb_merged['cluster']  == 0) & (~(nb_merged['Venue Category'].isin(['Pizza Place','Bagel Shop', 
                                                                                 'Kebab Restaurant', 
                                                                                 'Seafood Restaurant',
                                                                                 'Sandwich Place'])))]
Out[79]:
ID_x Venue Venue Category Venue Latitude Venue Longitude Distance from the NYC College of Technology Tips_count Price tier Likes_count Rating cluster
0 594277b586f4cc0f251fc389 DeKalb Market Hall Food Court 40.691250 -73.982579 571 96 2 731 9.3 0
16 55664aac498e10d4e5769e5e Damascus Bread & Pastry Shop Bakery 40.690047 -73.993054 819 80 3 420 9.0 0
17 583c7fa58cfe547612b47b93 Westville DUMBO American Restaurant 40.702021 -73.989596 776 114 2 416 9.0 0
28 51b0ceff2d4e2ee801fd4ab2 Foragers Deli / Bodega 40.702468 -73.988569 800 254 3 691 9.1 0
31 4a635ce2f964a520dbc41fe3 Lassen & Hennigs Deli / Bodega 40.694970 -73.994857 710 310 2 737 8.8 0
34 4a748cd9f964a52092de1fe3 Fast and Fresh Burrito Deli Burrito Place 40.687983 -73.986935 832 88 3 193 8.9 0
37 49f9dd95f964a5208d6d1fe3 Wild Ginger Vegetarian / Vegan Restaurant 40.687898 -73.989730 885 173 1 532 8.8 0
42 5a0740a3e1f22816d11723d5 Lillo Italian Restaurant 40.690200 -73.996540 1032 301 2 979 9.1 0
55 57d74b38498e41fcd6c3ee92 The Gumbo Bros Cajun / Creole Restaurant 40.689526 -73.991730 795 195 3 629 8.7 0
56 593c0d2262420b7feccc3048 Cecconi's Italian Restaurant 40.703893 -73.991609 1034 227 3 844 9.3 0
57 4f69f2b76d86f87117bb13ab Gran Eléctrica Mexican Restaurant 40.702570 -73.993096 969 168 1 1153 9.1 0
63 5330e34e498e36f70456a24b Saketumi Asian Bistro Asian Restaurant 40.694910 -73.994578 687 137 3 472 9.1 0
75 3fd66200f964a52058f11ee3 Bar Tabac French Restaurant 40.687369 -73.990106 951 165 2 329 8.1 0
79 572be80d498e66d83ce6f537 Beasts & Bottles New American Restaurant 40.690489 -73.995108 915 169 1 503 8.2 0
83 59399dcd12f0a93a5c392d6a Kotti Berliner Döner Kebab Restaurant 40.690662 -73.983533 588 98 3 330 8.1 0
93 5739c66c498e7ef6085cec4f Karasu Japanese Restaurant 40.689577 -73.973290 1290 219 2 563 8.2 0
96 49ba5f96f964a52060531fe3 Building on Bond American Restaurant 40.686502 -73.985150 1003 79 2 340 8.8 0
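Since the same filter is repeated for every cluster, it could be wrapped in a small helper so that the cluster number and the exclusion list are the only things that change. The sketch below uses a made-up toy frame; the helper name `competitors` is an assumption, while the column names `cluster` and `Venue Category` follow the merged table above.

```python
import pandas as pd


def competitors(frame, cluster_id, excluded_categories):
    """Rows of one cluster whose category is not in the exclusion list."""
    in_cluster = frame['cluster'] == cluster_id
    excluded = frame['Venue Category'].isin(excluded_categories)
    return frame[in_cluster & ~excluded]


# Toy data standing in for nb_merged
toy = pd.DataFrame({
    'Venue': ['A', 'B', 'C', 'D'],
    'Venue Category': ['Pizza Place', 'Thai Restaurant', 'Bakery', 'Thai Restaurant'],
    'cluster': [0, 0, 0, 1],
})
print(competitors(toy, 0, ['Pizza Place', 'Bakery']))
```

Passing the cluster number as an argument also makes it harder to leave a stale value in a copied cell.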

Cluster 1

In [58]:
nb_merged[nb_merged['cluster']  == 1].groupby('Venue Category')['Venue'].count().sort_values(ascending = False)
Out[58]:
Venue Category
American Restaurant          4
Italian Restaurant           3
Bakery                       2
Japanese Restaurant          2
French Restaurant            2
Gastropub                    1
Burger Joint                 1
Café                         1
Deli / Bodega                1
Dumpling Restaurant          1
Food Truck                   1
Thai Restaurant              1
German Restaurant            1
Taco Place                   1
Mediterranean Restaurant     1
Middle Eastern Restaurant    1
Pizza Place                  1
Salad Place                  1
Spanish Restaurant           1
Sushi Restaurant             1
Israeli Restaurant           1
Name: Venue, dtype: int64

From here, we exclude several categories (Bakery, Pizza Place, Burger Joint, Sushi Restaurant), since they're not our competitors.

In [82]:
nb_merged[(nb_merged['cluster']  == 1) & (~(nb_merged['Venue Category'].isin(['Bakery', 'Pizza Place', 'Burger Joint', 'Sushi Restaurant'])))]
Out[82]:
ID_x Venue Venue Category Venue Latitude Venue Longitude Distance from the NYC College of Technology Tips_count Price tier Likes_count Rating cluster
29 5942d5c94c9be62191b036fa sweetgreen Salad Place 40.702914 -73.989830 877 48 1 100 8.4 1
30 49d3773df964a520fb5b1fe3 Vinegar Hill House American Restaurant 40.702686 -73.981260 916 9 1 31 8.1 1
46 48a41073f964a52091511fe3 Hibino Japanese Restaurant 40.690076 -73.996497 1037 22 1 73 7.7 1
47 5b7c4cb3c58ed7002c1fd9bd Butler Bakeshop Café 40.703295 -73.992526 1011 44 2 163 8.6 1
50 4be30f9dd27a20a1cd1f915b River Deli Italian Restaurant 40.693713 -73.998435 1028 88 3 209 8.3 1
51 43164480f964a52065271fe3 Bedouin Tent Middle Eastern Restaurant 40.686936 -73.984469 963 4 2 18 8.0 1
52 4471bf9af964a5209c331fe3 Jack the Horse Tavern American Restaurant 40.699940 -73.993639 784 38 1 83 8.0 1
58 4d9f5a9efc4f721e7e5a9d5f Rucola Italian Restaurant 40.685659 -73.985769 1092 4 1 37 8.2 1
60 5b2932a0f5e9d70039787cf2 Los Tacos Al Pastor Taco Place 40.702436 -73.987539 782 6 2 29 8.0 1
64 4ab6d694f964a520407920e3 Henry Public Gastropub 40.690413 -73.996412 1009 144 1 262 8.5 1
66 4f6e5ea9e4b086107787908b La Vara Spanish Restaurant 40.687851 -73.995582 1144 6 1 17 7.8 1
68 5b7f53abe727c40024124d8c Oh! Dumplings Dumpling Restaurant 40.687722 -73.992943 1019 57 1 165 8.5 1
73 3fd66200f964a520efe81ee3 The River Café American Restaurant 40.703754 -73.994834 1162 11 2 83 8.2 1
74 59e7f722829b0c09cfce96d2 Black Forest Brooklyn German Restaurant 40.685613 -73.991097 1163 21 1 88 7.7 1
76 51782c87498e1803b70a775f DUMBO Food Truck Lot Food Truck 40.703126 -73.986851 854 18 3 97 8.2 1
80 4f7f8b86e4b088077df30175 Chez Moi French Restaurant 40.690654 -73.995777 950 7 2 27 7.7 1
81 58a202a25490d30f87553a08 Rice & Miso Japanese Restaurant 40.684633 -73.983768 1226 66 1 173 7.9 1
85 4a2aebabf964a5206e961fe3 Farmer in the Deli Deli / Bodega 40.693276 -73.971837 1258 4 2 29 7.7 1
86 51dd7c69498ee00b70fa54b8 Atrium DUMBO American Restaurant 40.703410 -73.990464 947 22 1 44 7.8 1
87 4112ca00f964a520ed0b1fe3 Joya Thai Restaurant 40.686708 -73.993718 1150 30 2 47 8.0 1
91 5250553911d262bb0c732ee8 Cafe Paulette French Restaurant 40.689743 -73.975798 1102 22 2 104 8.2 1
92 5914d24b3b83070821fcdd87 Miss Ada Israeli Restaurant 40.689560 -73.972351 1360 24 2 79 8.2 1
94 59dd280192e7a97276e4d52f Celestine Mediterranean Restaurant 40.704537 -73.988026 1019 33 1 69 7.9 1
95 5529a381498e1232f524541f Numero 28 Pizza & Cucina Italian Restaurant 40.686871 -73.990721 1020 3 1 15 7.6 1

Cluster 2

In [60]:
nb_merged[nb_merged['cluster']  == 2].groupby('Venue Category')['Venue'].count().sort_values(ascending = False)
Out[60]:
Venue Category
Bakery                       5
Japanese Restaurant          3
Bagel Shop                   3
Thai Restaurant              2
Burger Joint                 2
Chinese Restaurant           2
Italian Restaurant           2
French Restaurant            2
Pizza Place                  2
New American Restaurant      1
Restaurant                   1
Asian Restaurant             1
South American Restaurant    1
Seafood Restaurant           1
Bistro                       1
Sandwich Place               1
Café                         1
Caribbean Restaurant         1
Creperie                     1
Middle Eastern Restaurant    1
Cuban Restaurant             1
Diner                        1
Falafel Restaurant           1
Sushi Restaurant             1
Peruvian Restaurant          1
Latin American Restaurant    1
Mediterranean Restaurant     1
American Restaurant          1
Name: Venue, dtype: int64

From here, we exclude several categories that are not our direct competitors: Bakery, Bagel Shop, Pizza Place, Burger Joint, Seafood Restaurant, Sushi Restaurant, and the generic Restaurant category.

In [83]:
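# Categories that do not compete directly with a vegetarian restaurant
excluded_categories = ['Bakery', 'Pizza Place', 'Burger Joint', 'Sushi Restaurant',
                       'Bagel Shop', 'Seafood Restaurant', 'Restaurant']
# Note: this filter selects cluster 1, so the output below lists cluster 1 venues
nb_merged[(nb_merged['cluster'] == 1)
          & ~nb_merged['Venue Category'].isin(excluded_categories)]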
Out[83]:
ID_x Venue Venue Category Venue Latitude Venue Longitude Distance from the NYC College of Technology Tips_count Price tier Likes_count Rating cluster
29 5942d5c94c9be62191b036fa sweetgreen Salad Place 40.702914 -73.989830 877 48 1 100 8.4 1
30 49d3773df964a520fb5b1fe3 Vinegar Hill House American Restaurant 40.702686 -73.981260 916 9 1 31 8.1 1
46 48a41073f964a52091511fe3 Hibino Japanese Restaurant 40.690076 -73.996497 1037 22 1 73 7.7 1
47 5b7c4cb3c58ed7002c1fd9bd Butler Bakeshop Café 40.703295 -73.992526 1011 44 2 163 8.6 1
50 4be30f9dd27a20a1cd1f915b River Deli Italian Restaurant 40.693713 -73.998435 1028 88 3 209 8.3 1
51 43164480f964a52065271fe3 Bedouin Tent Middle Eastern Restaurant 40.686936 -73.984469 963 4 2 18 8.0 1
52 4471bf9af964a5209c331fe3 Jack the Horse Tavern American Restaurant 40.699940 -73.993639 784 38 1 83 8.0 1
58 4d9f5a9efc4f721e7e5a9d5f Rucola Italian Restaurant 40.685659 -73.985769 1092 4 1 37 8.2 1
60 5b2932a0f5e9d70039787cf2 Los Tacos Al Pastor Taco Place 40.702436 -73.987539 782 6 2 29 8.0 1
64 4ab6d694f964a520407920e3 Henry Public Gastropub 40.690413 -73.996412 1009 144 1 262 8.5 1
66 4f6e5ea9e4b086107787908b La Vara Spanish Restaurant 40.687851 -73.995582 1144 6 1 17 7.8 1
68 5b7f53abe727c40024124d8c Oh! Dumplings Dumpling Restaurant 40.687722 -73.992943 1019 57 1 165 8.5 1
73 3fd66200f964a520efe81ee3 The River Café American Restaurant 40.703754 -73.994834 1162 11 2 83 8.2 1
74 59e7f722829b0c09cfce96d2 Black Forest Brooklyn German Restaurant 40.685613 -73.991097 1163 21 1 88 7.7 1
76 51782c87498e1803b70a775f DUMBO Food Truck Lot Food Truck 40.703126 -73.986851 854 18 3 97 8.2 1
80 4f7f8b86e4b088077df30175 Chez Moi French Restaurant 40.690654 -73.995777 950 7 2 27 7.7 1
81 58a202a25490d30f87553a08 Rice & Miso Japanese Restaurant 40.684633 -73.983768 1226 66 1 173 7.9 1
85 4a2aebabf964a5206e961fe3 Farmer in the Deli Deli / Bodega 40.693276 -73.971837 1258 4 2 29 7.7 1
86 51dd7c69498ee00b70fa54b8 Atrium DUMBO American Restaurant 40.703410 -73.990464 947 22 1 44 7.8 1
87 4112ca00f964a520ed0b1fe3 Joya Thai Restaurant 40.686708 -73.993718 1150 30 2 47 8.0 1
91 5250553911d262bb0c732ee8 Cafe Paulette French Restaurant 40.689743 -73.975798 1102 22 2 104 8.2 1
92 5914d24b3b83070821fcdd87 Miss Ada Israeli Restaurant 40.689560 -73.972351 1360 24 2 79 8.2 1
94 59dd280192e7a97276e4d52f Celestine Mediterranean Restaurant 40.704537 -73.988026 1019 33 1 69 7.9 1
95 5529a381498e1232f524541f Numero 28 Pizza & Cucina Italian Restaurant 40.686871 -73.990721 1020 3 1 15 7.6 1

From the analysis above we can conclude that:

  • cluster 0 - cheap and mid-range venues with low to moderate tips counts and upper-middle ratings, in close vicinity (up to 800-900 m), mostly bakeries and Asian venues
  • cluster 1 - cheap to mid-range venues with moderate tips and likes counts and middle ratings, at middle distance (800-1300 m), mostly American and European restaurants
  • cluster 2 - expensive venues with moderate to very high tips and likes counts and high ratings, at middle distance (700-1000 m), mostly American and European restaurants
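
Cluster descriptions like the ones above can be read off per-cluster aggregates. The following is a minimal sketch of such a summary, assuming a DataFrame with the same column names as nb_merged. The helper name profile_clusters and the small sample frame are introduced here for illustration only; the five sample rows are copied from the cluster 1 output shown earlier, so the sketch yields a single profile row.

```python
import pandas as pd

DISTANCE = 'Distance from the NYC College of Technology'


def profile_clusters(venues: pd.DataFrame) -> pd.DataFrame:
    """Summarise each cluster: size, typical price tier, tips, rating, distance range."""
    return venues.groupby('cluster').agg(
        venues=('Venue', 'count'),
        median_price_tier=('Price tier', 'median'),
        median_tips=('Tips_count', 'median'),
        mean_rating=('Rating', 'mean'),
        min_distance_m=(DISTANCE, 'min'),
        max_distance_m=(DISTANCE, 'max'),
    )


# Illustrative sample: five rows copied from the cluster 1 table above
sample = pd.DataFrame({
    'Venue': ['sweetgreen', 'Vinegar Hill House', 'Hibino', 'Butler Bakeshop', 'River Deli'],
    DISTANCE: [877, 916, 1037, 1011, 1028],
    'Tips_count': [48, 9, 22, 44, 88],
    'Price tier': [1, 1, 1, 2, 3],
    'Rating': [8.4, 8.1, 7.7, 8.6, 8.3],
    'cluster': [1, 1, 1, 1, 1],
})

profile = profile_clusters(sample)
print(profile)
```

Applied to the full nb_merged frame, the same call would return one row per cluster, which makes it easier to check the verbal labels ("cheap", "moderate tips count", "middle distance") against the numbers.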

Since our restaurant is going to be vegetarian and mid-range in price, our direct competitors would be mostly cluster 0 venues and possibly cluster 1 venues, even though the latter are located further away. There is even one direct competitor, another veggie venue, Wild Ginger, in cluster 2, located about 900 m from the college. However, it is a more expensive venue, which could work to our advantage.

Let's see our top 20 competitors by rating:

In [94]:
# Categories that do not compete directly with a vegetarian restaurant
excluded_categories = ['Bakery', 'Pizza Place', 'Burger Joint', 'Sushi Restaurant',
                       'Bagel Shop', 'Seafood Restaurant', 'Restaurant',
                       'Kebab Restaurant', 'Sandwich Place']
comp_columns = ['Venue', 'Venue Category', 'Distance from the NYC College of Technology',
                'Tips_count', 'Price tier', 'Rating']

# Keep cluster 0 and 1 venues outside the excluded categories, then take the 20 best rated
comp = (nb_merged[nb_merged['cluster'].isin([0, 1])
                  & ~nb_merged['Venue Category'].isin(excluded_categories)][comp_columns]
        .nlargest(20, columns='Rating')
        .reset_index(drop=True))

comp
Out[94]:
Venue Venue Category Distance from the NYC College of Technology Tips_count Price tier Rating
0 DeKalb Market Hall Food Court 571 96 2 9.3
1 Cecconi's Italian Restaurant 1034 227 3 9.3
2 Foragers Deli / Bodega 800 254 3 9.1
3 Lillo Italian Restaurant 1032 301 2 9.1
4 Gran Eléctrica Mexican Restaurant 969 168 1 9.1
5 Saketumi Asian Bistro Asian Restaurant 687 137 3 9.1
6 Westville DUMBO American Restaurant 776 114 2 9.0
7 Fast and Fresh Burrito Deli Burrito Place 832 88 3 8.9
8 Lassen & Hennigs Deli / Bodega 710 310 2 8.8
9 Wild Ginger Vegetarian / Vegan Restaurant 885 173 1 8.8
10 Building on Bond American Restaurant 1003 79 2 8.8
11 The Gumbo Bros Cajun / Creole Restaurant 795 195 3 8.7
12 Butler Bakeshop Café 1011 44 2 8.6
13 Henry Public Gastropub 1009 144 1 8.5
14 Oh! Dumplings Dumpling Restaurant 1019 57 1 8.5
15 sweetgreen Salad Place 877 48 1 8.4
16 River Deli Italian Restaurant 1028 88 3 8.3
17 Rucola Italian Restaurant 1092 4 1 8.2
18 The River Café American Restaurant 1162 11 2 8.2
19 DUMBO Food Truck Lot Food Truck 854 18 3 8.2

Metro stations

Let's look for the metro stations near the college.

In [62]:
ny_metro_stations = pd.read_csv('ny_data/ny_metro_stations.csv')[['stop_name','complex_id','stop_lat','stop_lon' ]].dropna()

brooklyn_metro_stations = ny_metro_stations[ny_metro_stations['complex_id'].str.startswith('bk')]
brooklyn_metro_stations.head()
Out[62]:
stop_name complex_id stop_lat stop_lon
65 Clark St bk061 40.697466 -73.993086
66 Borough Hall bk068 40.693219 -73.989998
67 Hoyt St bk097 40.690545 -73.985065
68 Nevins St bk123 40.688246 -73.980492
69 Atlantic Av - Barclays Ctr bk026 40.684359 -73.977666
In [63]:
# add metro stations to the map as grey markers
for lat, lng, name in zip(brooklyn_metro_stations['stop_lat'],
                          brooklyn_metro_stations['stop_lon'],
                          brooklyn_metro_stations['stop_name']):
    folium.CircleMarker(
        [lat, lng],
        radius=10,
        popup=folium.Popup(name, parse_html=True),
        color='grey',
        fill=True,
        fill_color='grey',
        fill_opacity=0.7).add_to(map_clusters_1)

map_clusters_1
Out[63]:

Excellent! There are three metro stations in the vicinity: Court St, Borough Hall, and Jay St - MetroTech.
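
The map gives a visual impression of proximity; the same check can be done numerically. Below is a minimal sketch that computes great-circle (haversine) distances from the college to stations and keeps those within a chosen radius. The college coordinates are an approximation assumed for this example, the 800 m radius is arbitrary, and only the five stations printed by head() above are included, so the result does not reproduce the three stations named in the text. The helper names haversine_m and stations_within are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def stations_within(stations, origin, radius_m):
    """Return {name: distance_m} for stations within radius_m of origin, nearest first."""
    distances = {name: haversine_m(origin[0], origin[1], lat, lon)
                 for name, (lat, lon) in stations.items()}
    nearby = {name: d for name, d in distances.items() if d <= radius_m}
    return dict(sorted(nearby.items(), key=lambda item: item[1]))


# ASSUMPTION: approximate coordinates of the NYC College of Technology
COLLEGE = (40.6955, -73.9875)

# The five stations shown in the head() output above
stations = {
    'Clark St': (40.697466, -73.993086),
    'Borough Hall': (40.693219, -73.989998),
    'Hoyt St': (40.690545, -73.985065),
    'Nevins St': (40.688246, -73.980492),
    'Atlantic Av - Barclays Ctr': (40.684359, -73.977666),
}

nearby = stations_within(stations, COLLEGE, radius_m=800)
for name, dist in nearby.items():
    print(f'{name}: {dist:.0f} m')
```

Running the same function over the full brooklyn_metro_stations frame would give a distance-ranked list that can be compared with what the map shows.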

University locations

Now, let's load the data on New York universities to check whether the chosen starting point is actually part of a university agglomeration.

In [64]:
ny_uni = pd.read_csv('ny_data/nyu_uni.csv')
ny_uni.head()
Out[64]:
X Y id name address zip factype facname capacity capname bcode xcoord ycoord
0 1039079 234351 12000000001304 MARITIME COLLEGE AT FORT SCHUYLER (SUNY) 6 Pennyfield Ave 10465 1201 SUNY - The State University of New York 1799.0 Enrollment 36005 1039079.0 234351.0
1 999458 177948 12000000000085 HEALTH SCIENCE CENTER AT BROOKLYN (SUNY) 450 Clarkson Ave 11203 1201 SUNY - The State University of New York 1865.0 Enrollment 36047 999458.0 177948.0
2 985698 211585 12000000000022 FASHION INSTITUTE OF TECHNOLOGY (SUNY) 227 W 27 St 10001 1201 SUNY - The State University of New York 9764.0 Enrollment 36061 985698.0 211585.0
3 989188 214157 12000000000023 NYS COLLEGE OF OPTOMETRY (SUNY) 33 W 42 St 10036 1201 SUNY - The State University of New York 363.0 Enrollment 36061 989188.0 214157.0
4 1009244 251859 12000000001458 BRONX COMMUNITY COLLEGE (CUNY) 2155 University Ave 10453 1202 CUNY - The City University of New York 11506.0 Enrollment 36005 1009244.0 251859.0
In [65]:
from geopy.geocoders import Nominatim

def get_location(address):
    """Geocode an address in New York; return (latitude, longitude) or None on failure."""
    address = address + ', New York'
    try:
        geolocator = Nominatim(user_agent="ny_explorer")
        location = geolocator.geocode(address)
        return location.latitude, location.longitude
    except Exception:
        # the request failed or no match was found (location is None)
        return None
In [66]:
get_uni_location = ny_uni.address.apply(get_location)
In [67]:
ny_uni['latitude'] = [i[0] if i is not None else None for i in get_uni_location]
ny_uni['longitude'] = [i[1] if i is not None else None for i in get_uni_location]

ny_uni = ny_uni.dropna()[['name', 'latitude','longitude']]
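
Geocoding every row one by one can be slow, and the public Nominatim service asks for at most one request per second. The following is a minimal sketch of a more careful loop: it geocodes each distinct address once, pauses between calls, and records None on failure. The function geocode_unique and its parameters are hypothetical names introduced here; geocode_fn stands for any callable that behaves like geopy's geocode method (returns an object with latitude and longitude, or None when nothing is found). The stand-in below replaces the network call so the sketch runs offline.

```python
import time
from types import SimpleNamespace


def geocode_unique(addresses, geocode_fn, delay_s=1.0):
    """Geocode each distinct address once, waiting delay_s seconds between calls.

    Returns a dict {address: (latitude, longitude) or None}.
    """
    results = {}
    for address in dict.fromkeys(addresses):  # distinct addresses, order preserved
        if results:
            time.sleep(delay_s)  # pause before every call except the first
        try:
            location = geocode_fn(address)
        except Exception:
            location = None  # treat a failed request like "no match"
        if location is None:
            results[address] = None
        else:
            results[address] = (location.latitude, location.longitude)
    return results


# Stand-in geocoder for illustration only; the coordinates are placeholders, not real
def fake_geocode(address):
    if address == 'unknown place':
        return None
    return SimpleNamespace(latitude=1.0, longitude=2.0)


coords = geocode_unique(
    ['450 Clarkson Ave, New York', '450 Clarkson Ave, New York', 'unknown place'],
    fake_geocode,
    delay_s=0,
)
print(coords)
```

With geopy, geocode_fn could be the geocode method of a single Nominatim instance created once, and the resulting dictionary could then be mapped back onto the address column. geopy also provides a RateLimiter helper in geopy.extra.rate_limiter that serves a similar purpose.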
In [68]:
# add universities to the map as yellow markers
for lat, lng, name in zip(ny_uni['latitude'], ny_uni['longitude'], ny_uni['name']):
    folium.CircleMarker(
        [lat, lng],
        radius=10,
        popup=folium.Popup(name, parse_html=True),
        color='yellow',
        fill=True,
        fill_color='yellow',
        fill_opacity=0.7).add_to(map_clusters_1)

map_clusters_1
Out[68]:

Results and Discussion

The purpose of this project has been to investigate and segment the neighbourhoods in Brooklyn, specifically with respect to opening a vegetarian restaurant in the vicinity of the NYC College of Technology. The starting point was chosen for its proximity to metro stations and for the high number of young people, who are more likely to consume organic / vegetarian food. The requirement was to open the venue in a mid-range income area. For this purpose I analyzed a more macro indicator, real estate sales, which revealed that our territory is indeed a mid-income one. The analysis showed that most of the nearby buildings where flats were sold belong to cluster 0, which corresponds to COOPS - ELEVATOR APARTMENTS, CONDOS - ELEVATOR APARTMENTS and RENTALS - WALKUP APARTMENTS with prices between 550,000 USD and 1,200,000 USD. This is the most populous cluster.

Second, I segmented the restaurant venues within a radius of 2 km in order to obtain an adequate number and distribution of places. It turned out that most of the direct competitors would be mid-range American and European restaurants plus pizza places. Since our restaurant is going to be vegetarian and mid-range in price, our direct competitors would be mostly cluster 0 and 1 venues, even if they are located further away. There is even one direct competitor, another veggie venue, Wild Ginger, in cluster 2, located about 900 m from the college. However, it is a more expensive venue, which could work to our advantage. The College of Technology indeed appeared to be part of a university agglomeration, as shown on the map, with 5 institutions in total: NYC College of Technology, St Francis College, Brooklyn Law School, ASA College, and the Institute of Design and Construction.

Therefore, I recommend locating the new restaurant near Jay Street, Adams Street, Fulton Mall, Boroghby Street, or Tillary Street. An alternative could be Remsen Street, near St Francis College: it is close to the subway station at Court St and has few competitors in the immediate vicinity.

Conclusion

In the end, the possible locations for a new vegetarian restaurant in Brooklyn are near Jay Street, Adams Street, Fulton Mall, Boroghby Street, and Tillary Street. These streets offer good commute options and are not far from the universities. Some smaller streets in the vicinity, such as Chapel St, could be considered as well; however, we must make sure the site is not directly next to the cathedral located there.

The project has been quite a challenge, but I am glad I finished it. I hope that my findings present a sound approach to the analysis and are a good starting point for a more in-depth study by a potential investor who would actually open a restaurant like this in Brooklyn.

The project necessarily had certain limitations, related to the lack of current data, for example on real estate sales. A commercial project would require a more thorough investigation.
